Hi, I need to build a data structure to store aliases of arbitrary values (e.g. URIs), basically an undirected graph. Because I need a fast lookup, I thought using a map with bi-directional mappings could work well, e.g. stating "X" sameas "A" becomes this:
;; x == a : {a #{x} x #{a}} Continuing further... ;; a == b : {a #{b x} b #{a} x #{a}} ;; c == y : {a #{b x} b #{a} c #{y} x #{a} y #{c}} ;; y == b : {a #{b x} b #{a y} c #{y] x #{a} y #{b c}} I've written the functions below and they produce the correct result: (def ->set (fnil conj #{})) (defn register-same-as--map [aliases x y] (-> aliases (update-in [x] ->set y) (update-in [y] ->set x))) (defn same-as?--map "Takes a map of same-as rels and two keys, returns last if both keys are considered equal." ([aliases x y] (or (= x y) (same-as?--map aliases [x] y #{}))) ([aliases [x & queue] y visited] (or (get-in aliases [x y]) (when-let [q' (->> x aliases (filter (complement visited)) (into queue) set seq)] (recur aliases q' y (conj visited x)))))) (time (dotimes [i 100000] (def am (reduce (fn [a [x y]] (register-same-as--map a x y)) {} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]])))) ;; "Elapsed time: 1976.503 msecs" (same-as?--map am 'x 'y) ;; y (same-as?--map am 'a 'c) ;; c (same-as?--map am 'x 'g) ;; nil (time (dotimes [i 100000] (same-as?--map am 'x 'y))) ;; "Elapsed time: 353.338 msecs" Since in most of my use cases I won't have to recurse over more than a few edges per lookup this approach should work okay. However, I was still interested in other more compact/faster options. For example, provided there're aren't too many distinct sets of unrelated nodes, using sets like below would provide (~4x) faster lookups, however am not sure if that's really the best way merging those sets efficiently whenever needed and new edges are added. I guess another level of branching in `register-same-as` would help to avoid the merging, but are there any fundamentally better approaches than those two listed? ;; x->a #{#{x a}} ;; x->b #{#{x a b}} ;; c->y #{#{x a b} #{c y}} ;; y->b #{#{x a b c y}} ; merge (defn register-same-as--set [aliases x y] (let [matches (filter #(or (% x) (% y)) aliases)] (if (seq matches) (let [a' (reduce disj aliases matches) m' (reduce #(into % %2) matches)] (conj a' (conj m' x y))) (conj aliases #{x y})))) (defn same-as?--set [aliases x y] (some #(and (% x) (% y)) aliases)) (time (dotimes [i 100000] (def as (reduce (fn [a [x y]] (register-same-as--set a x y)) #{} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]])))) ;; "Elapsed time: 2308.634 msecs" (time (dotimes [i 100000] (same-as?--set as 'x 'y))) ;; "Elapsed time: 89.197 msecs" Thoughts? & Thanks! :) -- Karsten Schmidt http://postspectacular.com | http://toxiclibs.org | http://toxi.co.uk -- -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en --- You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/groups/opt_out.