Re: The unshared part of two mostly-shared structures

rapido Sat, 21 Mar 2009 15:51:17 -0700

i have done some homework on this problem :)

(warning: not clojure specific)


my claim is that diffing two similar sets (or maps) *can* be made
efficient only *if* you can add arbitrary items to sets efficiently
(i.e. O(log(n)).

remember that immutable sets are usually laid out in a tree structure.
this also means that similar sets share a lot of tree structure.

if pointer equality is allowed, the differencing of two sets can be
made O(n) where n is the number of differences.
this can be achieved by comparing (on equality) left and right (sub)
trees of two sets.

more strongly, if the (sorted) differences between two sets are
consecutive, it can be made O(log n).
and more generally, if there are (m) blocks of consecutive (n)
differences, you can get it O(m*log(n)))
(google search 'treaps')

pointer equality doesn't always cut it however:
when two disparate processes build two similar sets (or trees), their
pointers will be different and pointer equality fails.

this can be solved by replacing pointers with hashes (of recursive
tree content), effectively replacing pointer equality by hash
equality.

my programming language enchilada (www.enchiladacode.nl) has
hash=pointer equality build in.
i believe it shouldn't be to difficult to introduce some of
enchilada's internals to clojure.

- robbert.



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/clojure?hl=en
-~----------~----~----~----~------~----~------~--~---

Re: The unshared part of two mostly-shared structures

Reply via email to