By storing the set in a binary (or n-ary) tree and maintaining the
total weight in a node you can make every operation (draw, insert,
union, intersect) O(log n) and total-weight O(1). Draw like this
(where node(w,l,r) denotes a node with weight w, left subtree l and
right subtree r and leaf(w,v) denotes a leaf with weight w and value
v):

draw(node) = search(random number between 0 and weight(node), node)
search(x,node(w,l,r)) = if x < weight(l) then search(x,l) else
search(x-weight(l),r)
search(x,leaf(w,v)) = v

To make this work you need to maintain the invariant w =
weight(node(w,l,r)) = weight(l) + weight(r) in the set operations.

You can use the same idea for hash tries, just maintain the total
weight in each node and draw with this algorithm.

On 26 jun, 15:54, Nicolas Oury <nicolas.o...@gmail.com> wrote:
> Thank you very much for your answer. I forgot to tell you  that I also needs
> to be able to have usual set operations.
>
> They become linear with your 1st idea. (But, sorry, it's my bad! I should
> have asked the right question. Your answer is very good for the question I
> actually asked, not the one I had in mind :))
> The second idea is also good, but would perform a bit less good than using
> wider trees than binary.
> (I might use this second idea, though, with a set instead of a vector, if I
> can't manage to make HashTries work).
>
>
>
> On Sat, Jun 26, 2010 at 2:50 PM, Jules <julesjac...@gmail.com> wrote:
> > You can store the numbers in a vector [2 5 3]. Instead of doing this
> > store the cumulative weight v = [0 2 7 10]. What your problem comes
> > down to is given a number n, find the the lowest index i such that
> > v[i] <= n. This can be accomplished with binary search. You can also
> > use a binary search tree instead of a vector. You also keep a vector w
> > with the actual objects. Drawing is done like this: generate a random
> > number n between 0 and total-weight = (last v). Then find the smallest
> > index i such that v[i] <= n. The drawn object is w[i]. Hope this
> > helps.
>
> > On 26 jun, 15:01, Nicolas Oury <nicolas.o...@gmail.com> wrote:
> > > Dear all,
>
> > > for a project, I need a data structure to represent finite distribution,
> > > that is a set of elements, each of which having a mass in the set, and
> > two
> > > functions:
>
> > > - total-mass : the sum of the masses of each element
> > > - draw : return an element with a probability proportional to its mass
>
> > > I currently use the simple idea : a Persistent Hash Map from elt to mass:
> > > - the total-mass is the reduce of (+ . second)
> > > - drawing is choosing a random-number between 0 and the total-mass and
> > > looping through the seq, to look for the corresponding element.
>
> > > Both operation are linear in the number of elements. This is a problem,
> > > because these sets tend to be quite big in my project.
>
> > > I think that if I used a HashTrie with total-mass cached in each node,
> > > total-mass would be in constant time and draw in log time.
>
> > > So here is my question : does anyone has, for example for the "clojure in
> > > clojure" project, a sketch of an implementation of PersistentHashMap
> > > in clojure itself? (I would be happy to contribute, if it needs more
> > work).
> > > Or has anyone a better idea that mine to have drawable sets?
>
> > > Thanks for your help,
>
> > > Best regards,
>
> > > Nicolas.
>
> > --
> > You received this message because you are subscribed to the Google
> > Groups "Clojure" group.
> > To post to this group, send email to clojure@googlegroups.com
> > Note that posts from new members are moderated - please be patient with
> > your first post.
> > To unsubscribe from this group, send email to
> > clojure+unsubscr...@googlegroups.com<clojure%2bunsubscr...@googlegroups.com 
> > >
> > For more options, visit this group at
> >http://groups.google.com/group/clojure?hl=en

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en

Reply via email to