Yes and no. I need efficient two way lookup. 

So, I need to do something like
For every key in {12 "a" 23 "aa" 234 "aaaa"}
Get the token in a map that is organised like this:
{token1 id1 token2 id2 token2 id3}

Naively, I could have two maps with the reverse lookups.
l1 = {token1 id1 token2 id2 token3 id3}
reverse = {id1 token1 id2 token2 ...}

Given that there will be many, many tokens this seems a bit too much overhead.

You could basically do a binarySearch on a sorted hash map (sorted by vals 
(e.g. ids)) but I thought
doing the search on a vec of vectors would be more efficient and the vec of 
vecs representation would
be more compact.

I guess that's what I'm trying to figure out.

Andreas

P.S. Lookup by token is more frequent than lookup by id, hence the bin search 
on id.




The sparse vec will be [id val] pairs.
There is a big map of token-> id. So I can get to the id of a word vial 
(token-map token)
If I now have a map of 
{id1 token1 id1 token2 ... } 
On 26/05/2011, at 12:00 PM, Alan Malloy wrote:

> I think you've reinvented hashmaps:
> 
> {12 "a" 23 "aa" 25 "aaa" 234 "aaaa"}
> 
> On May 25, 6:51 pm, Andreas Kostler <andreas.koestler.le...@gmail.com>
> wrote:
>> Hi all,
>> has anyone spent some thought on how to efficiently represent sparse vectors 
>> in Clojure?
>> A naive scheme I came up with is using a vector of [idx val] pairs, e.g.:
>> (def sparse-vec [[12 "a"][23 "aa"][25 "aaa"][234 "aaaa"]])
>> 
>> Accessing a value at an idx can be done so:
>> (get-nth sparse-vec 25)
>> => "aaa"
>> 
>> with:
>> 
>> (defn get-nth [sparse-vec n]
>>         (let [indices (map (fn [[i _]] i) sparse-vec)
>>              idx (java.util.Collections/binarySearch indices n)
>>              [_ v] (nth sparse-vec idx)]
>>         v))
>> 
>> For this to work the indices have to be in order. Can anyone think of more 
>> efficient (algorithmically and implementation wise)
>> ways of doing this?
>> 
>> Kind Regards
>> Andreas
> 
> -- 
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your 
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en

--
"Test-driven Dentistry (TDD!) - Not everything should be test driven"
- Michael Fogus
-- 
**********************************************************
Andreas Koestler, Software Engineer
Leica Geosystems Pty Ltd
270 Gladstone Road, Dutton Park QLD 4102
Main: +61 7 3891 9772     Direct: +61 7 3117 8808
Fax: +61 7 3891 9336
Email: andreas.koest...@leica-geosystems.com

************www.leica-geosystems.com*************

when it has to be right, Leica Geosystems

Please  consider the environment before printing this email.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to