Yes and no. I need efficient two way lookup. So, I need to do something like For every key in {12 "a" 23 "aa" 234 "aaaa"} Get the token in a map that is organised like this: {token1 id1 token2 id2 token2 id3}
Naively, I could have two maps with the reverse lookups. l1 = {token1 id1 token2 id2 token3 id3} reverse = {id1 token1 id2 token2 ...} Given that there will be many, many tokens this seems a bit too much overhead. You could basically do a binarySearch on a sorted hash map (sorted by vals (e.g. ids)) but I thought doing the search on a vec of vectors would be more efficient and the vec of vecs representation would be more compact. I guess that's what I'm trying to figure out. Andreas P.S. Lookup by token is more frequent than lookup by id, hence the bin search on id. The sparse vec will be [id val] pairs. There is a big map of token-> id. So I can get to the id of a word vial (token-map token) If I now have a map of {id1 token1 id1 token2 ... } On 26/05/2011, at 12:00 PM, Alan Malloy wrote: > I think you've reinvented hashmaps: > > {12 "a" 23 "aa" 25 "aaa" 234 "aaaa"} > > On May 25, 6:51 pm, Andreas Kostler <andreas.koestler.le...@gmail.com> > wrote: >> Hi all, >> has anyone spent some thought on how to efficiently represent sparse vectors >> in Clojure? >> A naive scheme I came up with is using a vector of [idx val] pairs, e.g.: >> (def sparse-vec [[12 "a"][23 "aa"][25 "aaa"][234 "aaaa"]]) >> >> Accessing a value at an idx can be done so: >> (get-nth sparse-vec 25) >> => "aaa" >> >> with: >> >> (defn get-nth [sparse-vec n] >> (let [indices (map (fn [[i _]] i) sparse-vec) >> idx (java.util.Collections/binarySearch indices n) >> [_ v] (nth sparse-vec idx)] >> v)) >> >> For this to work the indices have to be in order. Can anyone think of more >> efficient (algorithmically and implementation wise) >> ways of doing this? >> >> Kind Regards >> Andreas > > -- > You received this message because you are subscribed to the Google > Groups "Clojure" group. > To post to this group, send email to clojure@googlegroups.com > Note that posts from new members are moderated - please be patient with your > first post. > To unsubscribe from this group, send email to > clojure+unsubscr...@googlegroups.com > For more options, visit this group at > http://groups.google.com/group/clojure?hl=en -- "Test-driven Dentistry (TDD!) - Not everything should be test driven" - Michael Fogus -- ********************************************************** Andreas Koestler, Software Engineer Leica Geosystems Pty Ltd 270 Gladstone Road, Dutton Park QLD 4102 Main: +61 7 3891 9772 Direct: +61 7 3117 8808 Fax: +61 7 3891 9336 Email: andreas.koest...@leica-geosystems.com ************www.leica-geosystems.com************* when it has to be right, Leica Geosystems Please consider the environment before printing this email. -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en