This is what I've come up with so far...it should give a bit of an idea of what I'm using it for:
(defprotocol PSparseVec (norm [this]) (cnt [this]) (ith-val [this i]) (unit-vec [this]) (assc [this i o])) (defrecord SparseVec [indices vals] PSparseVec (cnt [this] (count indices)) (norm [this] (Math/sqrt (reduce + (map #(* % %) vals)))) (unit-vec [this] (let [l-inv (/ 1 (norm this))] (SparseVec. indices (vec (map #(* l-inv %) vals))))) (assc [this i o] (let [idx (bin-search indices i) val (get vals idx nil)] (if val (SparseVec. indices (assoc vals idx o)) (let [new-indices (sort (conj indices i)) new-idx (bin-search new-indices i) [chunk1 chunk2] (split-at new-idx vals)] (SparseVec. new-indices (into (conj (vec chunk1) o) chunk2)))))) (ith-val [this i] (get vals (bin-search indices i) 0))) Can you guys think of ways of making this more idiomatic and/or performant? Cheers Andreas On 26/05/2011, at 1:12 PM, Ken Wesson wrote: > On Wed, May 25, 2011 at 10:22 PM, Andreas Kostler > <andreas.koestler.le...@gmail.com> wrote: >> Yes and no. I need efficient two way lookup. >> >> So, I need to do something like >> For every key in {12 "a" 23 "aa" 234 "aaaa"} >> Get the token in a map that is organised like this: >> {token1 id1 token2 id2 token2 id3} >> >> Naively, I could have two maps with the reverse lookups. >> l1 = {token1 id1 token2 id2 token3 id3} >> reverse = {id1 token1 id2 token2 ...} >> >> Given that there will be many, many tokens this seems a bit too much >> overhead. >> >> You could basically do a binarySearch on a sorted hash map (sorted by vals >> (e.g. ids)) but I thought >> doing the search on a vec of vectors would be more efficient and the vec of >> vecs representation would >> be more compact. >> >> I guess that's what I'm trying to figure out. >> >> Andreas >> >> P.S. Lookup by token is more frequent than lookup by id, hence the bin >> search on id. > > If the token and id sets are disjoint (no single object can ever > appear as each -- only one of those at most) then you can use a single > map with a function put defined as: > > (defn put [m k v] > (assoc (assoc m k v) v k)) > > which adds mappings in both directions. Then just use (m k) or (m v) > to perform lookups in either direction. The overhead may be less than > with two separate hashmaps, and the code is certainly simpler. > > -- > Protege: What is this seething mass of parentheses?! > Master: Your father's Lisp REPL. This is the language of a true > hacker. Not as clumsy or random as C++; a language for a more > civilized age. > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with your > first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en