On 26/05/2011, at 1:12 PM, Ken Wesson wrote:
> On Wed, May 25, 2011 at 10:22 PM, Andreas Kostler
> <[email protected]> wrote:
>> Yes and no. I need efficient two way lookup.
>>
>> So, I need to do something like
>> For every key in {12 "a" 23 "aa" 234 "aaaa"}
>> Get the token in a map that is organised like this:
>> {token1 id1 token2 id2 token2 id3}
>>
>> Naively, I could have two maps with the reverse lookups.
>> l1 = {token1 id1 token2 id2 token3 id3}
>> reverse = {id1 token1 id2 token2 ...}
>>
>> Given that there will be many, many tokens this seems a bit too much
>> overhead.
>>
>> You could basically do a binarySearch on a sorted hash map (sorted by vals
>> (e.g. ids)) but I thought
>> doing the search on a vec of vectors would be more efficient and the vec of
>> vecs representation would
>> be more compact.
>>
>> I guess that's what I'm trying to figure out.
>>
>> Andreas
>>
>> P.S. Lookup by token is more frequent than lookup by id, hence the bin
>> search on id.
>
> If the token and id sets are disjoint (no single object can ever
> appear as each -- only one of those at most) then you can use a single
> map with a function put defined as:
>
> (defn put [m k v]
> (assoc (assoc m k v) v k))
>
> which adds mappings in both directions. Then just use (m k) or (m v)
> to perform lookups in either direction. The overhead may be less than
> with two separate hashmaps, and the code is certainly simpler.
>
Hi Ken,
That indeed could be useful. What are your thoughts on the proposed solution of
representing
sparse vectors as maps (id-> val)? Having a vec of vector or a map with ids and
vals vectors
{:ids [vec of ids] :vals [vec of val]} is just a lot easier to deal with for
calculations. E.g normalisation
;; For a sparse vector as map
(defn normalise-m [m]
(let [vals (vals m)
keys (keys m)
l (Math/sqrt (reduce + (map #(* % %) vals)))]
(zipmap keys (map #(/ % l) vals))))
;; For a map of id/val vecs
(defn normalise [sv]
(let [l (Math/sqrt (reduce + (map #(* % %) (:vals sv))))]
{:i (:i sv) :vals (map #(/ % l) (:vals sv))}))
The latter outperforms the former by more than a factor of 10.
Has anyone used sparse vectors or think of a reasonable implementation?
> --
> Protege: What is this seething mass of parentheses?!
> Master: Your father's Lisp REPL. This is the language of a true
> hacker. Not as clumsy or random as C++; a language for a more
> civilized age.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en