Hi, I need to build a data structure to store aliases of arbitrary
values (e.g. URIs), basically an undirected graph. Because I need a
fast lookup, I thought using a map with bi-directional mappings could
work well, e.g. stating "X" sameas "A" becomes this:

;; x == a : {a #{x} x #{a}}

Continuing further...

;; a == b : {a #{b x} b #{a} x #{a}}
;; c == y : {a #{b x} b #{a} c #{y} x #{a} y #{c}}
;; y == b : {a #{b x} b #{a y} c #{y] x #{a} y #{b c}}

I've written the functions below and they produce the correct result:

(def ->set (fnil conj #{}))

(defn register-same-as--map
  [aliases x y]
  (-> aliases
      (update-in [x] ->set y)
      (update-in [y] ->set x)))

(defn same-as?--map
  "Takes a map of same-as rels and two keys, returns last if both keys
  are considered equal."
  ([aliases x y] (or (= x y) (same-as?--map aliases [x] y #{})))
  ([aliases [x & queue] y visited]
     (or
      (get-in aliases [x y])
      (when-let [q' (->> x
                         aliases
                         (filter (complement visited))
                         (into queue)
                         set
                         seq)]
        (recur aliases q' y (conj visited x))))))

(time
 (dotimes [i 100000]
   (def am
     (reduce
      (fn [a [x y]] (register-same-as--map a x y))
      {} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]]))))
;; "Elapsed time: 1976.503 msecs"

(same-as?--map am 'x 'y)
;; y
(same-as?--map am 'a 'c)
;; c
(same-as?--map am 'x 'g)
;; nil

(time (dotimes [i 100000] (same-as?--map am 'x 'y)))
;; "Elapsed time: 353.338 msecs"

Since in most of my use cases I won't have to recurse over more than a
few edges per lookup this approach should work okay. However, I was
still interested in other more compact/faster options. For example,
provided there're aren't too many distinct sets of unrelated nodes,
using sets like below would provide (~4x) faster lookups, however am
not sure if that's really the best way merging those sets efficiently
whenever needed and new edges are added. I guess another level of
branching in `register-same-as` would help to avoid the merging, but
are there any fundamentally better approaches than those two listed?

;; x->a #{#{x a}}
;; x->b #{#{x a b}}
;; c->y #{#{x a b} #{c y}}
;; y->b #{#{x a b c y}} ; merge

(defn register-same-as--set
  [aliases x y]
  (let [matches (filter #(or (% x) (% y)) aliases)]
    (if (seq matches)
      (let [a' (reduce disj aliases matches)
            m' (reduce #(into % %2) matches)]
        (conj a' (conj m' x y)))
      (conj aliases #{x y}))))

(defn same-as?--set
  [aliases x y] (some #(and (% x) (% y)) aliases))

(time
 (dotimes [i 100000]
   (def as
     (reduce
      (fn [a [x y]] (register-same-as--set a x y))
      #{} '[[x d] [a b] [b d] [e f] [d y] [c y] [y b] [g e]]))))
;; "Elapsed time: 2308.634 msecs"

(time (dotimes [i 100000] (same-as?--set as 'x 'y)))
;; "Elapsed time: 89.197 msecs"

Thoughts? & Thanks! :)

-- 
Karsten Schmidt
http://postspectacular.com | http://toxiclibs.org | http://toxi.co.uk

-- 
-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups 
"Clojure" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to clojure+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Reply via email to