There seems to be convergence among MVCC DB systems (except those requiring address stability) to the "update-in-place with undo log" approach to version storage. Why not use this approach to make any standalone data structure multiversioned? Just take any update-in-place structure and add an append-only undo log. The primary structure stores current versions and the undo log stores previous versions. Each entry in either the primary structure or the undo log has a pointer (or log offset) to the previous version in the undo log, so the primary structure effectively holds a linked list of versions for each entry. Because the undo log is totally ordered by update recency, it can easily be truncated at a particular LSN or timestamp (e.g., the timestamp of the oldest active transaction in a database). For a concurrent primary structure, the undo log itself can provide a total order on operations (e.g., via fetch-and-add on the next log offset), which may have additional applications besides version management (e.g., in replication).
=> More information about this toot | More toots from tobinbaker@discuss.systems
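A minimal sketch of the "update-in-place with undo log" idea from the post above, using a plain Python dict as the primary structure. All names here (`VersionedDict`, `UndoEntry`, `put`, `get`, `truncate`) are hypothetical and chosen for illustration. The single-threaded logical clock and the `end_ts` field are assumptions of the sketch, not something the post specifies.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class UndoEntry:
    value: Any            # the value that was overwritten
    ts: int               # timestamp at which that value was written
    end_ts: int           # timestamp at which it was overwritten (assumption)
    prev: Optional[int]   # absolute log offset of the next-older version


class VersionedDict:
    """Hypothetical multiversioned dict: current versions live in `current`,
    previous versions live in an append-only undo log."""

    def __init__(self):
        self.current = {}   # key -> (value, ts, offset of previous version)
        self.log = []       # undo log, ordered by end_ts
        self.base = 0       # absolute offset of log[0] after truncation
        self.clock = 0      # logical timestamp (single-threaded assumption)

    def put(self, key, value):
        self.clock += 1
        prev = None
        if key in self.current:
            old_value, old_ts, old_prev = self.current[key]
            # Move the old version to the log, then update in place.
            self.log.append(UndoEntry(old_value, old_ts, self.clock, old_prev))
            prev = self.base + len(self.log) - 1
        self.current[key] = (value, self.clock, prev)
        return self.clock

    def get(self, key, ts=None):
        """Read the latest version, or the version visible at timestamp ts."""
        if key not in self.current:
            return None
        value, vts, prev = self.current[key]
        if ts is None or vts <= ts:
            return value
        # Follow the per-key version chain through the undo log.
        while prev is not None and prev >= self.base:
            entry = self.log[prev - self.base]
            if entry.ts <= ts:
                return entry.value
            prev = entry.prev
        return None  # key did not exist at ts, or that version was truncated

    def truncate(self, oldest_active_ts):
        """Drop the log prefix that no reader at ts >= oldest_active_ts needs."""
        n = 0
        while n < len(self.log) and self.log[n].end_ts <= oldest_active_ts:
            n += 1
        del self.log[:n]
        self.base += n


d = VersionedDict()
t1 = d.put("a", 1)
t2 = d.put("a", 2)
t3 = d.put("b", 10)
t4 = d.put("a", 3)
```

Reads of the latest version never touch the log. Because entries are appended in overwrite order, truncation is a prefix drop, and any reader at or after `oldest_active_ts` only ever follows pointers into the retained suffix.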
@tobinbaker There's also the "fat node" approach where each node can contain multiple versions (at least 2) so you can amortize the O(log n) path copy to O(1) for isolated updates. You don't need to overwrite the existing version in place for that.
=> More information about this toot | More toots from pervognsen@mastodon.social
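A rough sketch of the fat-node idea from the reply above, under simplifying assumptions: each pointer field keeps an unbounded list of timestamped values, whereas the reply describes nodes with a small bounded number of versions. The tree is an unbalanced, insert-only binary search tree. `FatField`, `Node`, and `FatTree` are hypothetical names for illustration.

```python
import bisect


class FatField:
    """A field that keeps every (version, value) pair it was assigned,
    instead of being overwritten in place."""

    def __init__(self):
        self.stamps = []
        self.values = []

    def set(self, version, value):
        # Versions must be assigned in increasing order.
        assert not self.stamps or version > self.stamps[-1]
        self.stamps.append(version)
        self.values.append(value)

    def get(self, version):
        # Latest value written at or before `version`, or None.
        i = bisect.bisect_right(self.stamps, version) - 1
        return self.values[i] if i >= 0 else None


class Node:
    def __init__(self, key):
        self.key = key
        self.left = FatField()
        self.right = FatField()


class FatTree:
    def __init__(self):
        self.root = FatField()
        self.version = 0

    def insert(self, key):
        """Insert key and return the new version. Only one fat field is
        written, so no root-to-leaf path is copied."""
        latest = self.version
        field = self.root
        node = field.get(latest)
        while node is not None:
            if key == node.key:
                return self.version  # already present, no new version
            field = node.left if key < node.key else node.right
            node = field.get(latest)
        self.version += 1
        field.set(self.version, Node(key))
        return self.version

    def contains(self, key, version):
        """Membership test against the tree as of `version`."""
        node = self.root.get(version)
        while node is not None:
            if key == node.key:
                return True
            field = node.left if key < node.key else node.right
            node = field.get(version)
        return False


t = FatTree()
v1 = t.insert(5)
v2 = t.insert(3)
v3 = t.insert(8)
```

Old versions remain readable because no pointer is ever overwritten. The cost is a binary search inside each fat field on every traversal step, which bounded-size nodes are meant to avoid.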
@pervognsen Right, I haven't really worked out the pros/cons of this approach WRT traditional persistent data structures (either imperative or purely functional). It's clearly heavily optimized (just like MVCC databases) for queries over the latest versions.
=> More information about this toot | More toots from tobinbaker@discuss.systems