This is what I've come up with so far...it should give a bit of an idea of what
I'm using it for:
(defprotocol PSparseVec
(norm [this])
(cnt [this])
(ith-val [this i])
(unit-vec [this])
(assc [this i o]))
(defrecord SparseVec [indices vals]
PSparseVec
(cnt [this] (count indices))
(norm [this] (Math/sqrt (reduce + (map #(* % %) vals))))
(unit-vec [this] (let [l-inv (/ 1 (norm this))] (SparseVec. indices (vec (map
#(* l-inv %) vals)))))
(assc [this i o]
(let [idx (bin-search indices i)
val (get vals idx nil)]
(if val (SparseVec. indices (assoc vals idx o))
(let [new-indices (sort (conj indices i))
new-idx (bin-search new-indices i)
[chunk1 chunk2] (split-at new-idx vals)]
(SparseVec. new-indices (into (conj (vec chunk1) o) chunk2))))))
(ith-val [this i]
(get vals (bin-search indices i) 0)))
Can you guys think of ways of making this more idiomatic and/or performant?
Cheers
Andreas
On 26/05/2011, at 1:12 PM, Ken Wesson wrote:
> On Wed, May 25, 2011 at 10:22 PM, Andreas Kostler
> <[email protected]> wrote:
>> Yes and no. I need efficient two way lookup.
>>
>> So, I need to do something like
>> For every key in {12 "a" 23 "aa" 234 "aaaa"}
>> Get the token in a map that is organised like this:
>> {token1 id1 token2 id2 token2 id3}
>>
>> Naively, I could have two maps with the reverse lookups.
>> l1 = {token1 id1 token2 id2 token3 id3}
>> reverse = {id1 token1 id2 token2 ...}
>>
>> Given that there will be many, many tokens this seems a bit too much
>> overhead.
>>
>> You could basically do a binarySearch on a sorted hash map (sorted by vals
>> (e.g. ids)) but I thought
>> doing the search on a vec of vectors would be more efficient and the vec of
>> vecs representation would
>> be more compact.
>>
>> I guess that's what I'm trying to figure out.
>>
>> Andreas
>>
>> P.S. Lookup by token is more frequent than lookup by id, hence the bin
>> search on id.
>
> If the token and id sets are disjoint (no single object can ever
> appear as each -- only one of those at most) then you can use a single
> map with a function put defined as:
>
> (defn put [m k v]
> (assoc (assoc m k v) v k))
>
> which adds mappings in both directions. Then just use (m k) or (m v)
> to perform lookups in either direction. The overhead may be less than
> with two separate hashmaps, and the code is certainly simpler.
>
> --
> Protege: What is this seething mass of parentheses?!
> Master: Your father's Lisp REPL. This is the language of a true
> hacker. Not as clumsy or random as C++; a language for a more
> civilized age.
>
> --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to [email protected]
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
--
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en