By storing the set in a binary (or n-ary) tree and maintaining the total weight in a node you can make every operation (draw, insert, union, intersect) O(log n) and total-weight O(1). Draw like this (where node(w,l,r) denotes a node with weight w, left subtree l and right subtree r and leaf(w,v) denotes a leaf with weight w and value v):
draw(node) = search(random number between 0 and weight(node), node) search(x,node(w,l,r)) = if x < weight(l) then search(x,l) else search(x-weight(l),r) search(x,leaf(w,v)) = v To make this work you need to maintain the invariant w = weight(node(w,l,r)) = weight(l) + weight(r) in the set operations. You can use the same idea for hash tries, just maintain the total weight in each node and draw with this algorithm. On 26 jun, 15:54, Nicolas Oury <nicolas.o...@gmail.com> wrote: > Thank you very much for your answer. I forgot to tell you that I also needs > to be able to have usual set operations. > > They become linear with your 1st idea. (But, sorry, it's my bad! I should > have asked the right question. Your answer is very good for the question I > actually asked, not the one I had in mind :)) > The second idea is also good, but would perform a bit less good than using > wider trees than binary. > (I might use this second idea, though, with a set instead of a vector, if I > can't manage to make HashTries work). > > > > On Sat, Jun 26, 2010 at 2:50 PM, Jules <julesjac...@gmail.com> wrote: > > You can store the numbers in a vector [2 5 3]. Instead of doing this > > store the cumulative weight v = [0 2 7 10]. What your problem comes > > down to is given a number n, find the the lowest index i such that > > v[i] <= n. This can be accomplished with binary search. You can also > > use a binary search tree instead of a vector. You also keep a vector w > > with the actual objects. Drawing is done like this: generate a random > > number n between 0 and total-weight = (last v). Then find the smallest > > index i such that v[i] <= n. The drawn object is w[i]. Hope this > > helps. > > > On 26 jun, 15:01, Nicolas Oury <nicolas.o...@gmail.com> wrote: > > > Dear all, > > > > for a project, I need a data structure to represent finite distribution, > > > that is a set of elements, each of which having a mass in the set, and > > two > > > functions: > > > > - total-mass : the sum of the masses of each element > > > - draw : return an element with a probability proportional to its mass > > > > I currently use the simple idea : a Persistent Hash Map from elt to mass: > > > - the total-mass is the reduce of (+ . second) > > > - drawing is choosing a random-number between 0 and the total-mass and > > > looping through the seq, to look for the corresponding element. > > > > Both operation are linear in the number of elements. This is a problem, > > > because these sets tend to be quite big in my project. > > > > I think that if I used a HashTrie with total-mass cached in each node, > > > total-mass would be in constant time and draw in log time. > > > > So here is my question : does anyone has, for example for the "clojure in > > > clojure" project, a sketch of an implementation of PersistentHashMap > > > in clojure itself? (I would be happy to contribute, if it needs more > > work). > > > Or has anyone a better idea that mine to have drawable sets? > > > > Thanks for your help, > > > > Best regards, > > > > Nicolas. > > > -- > > You received this message because you are subscribed to the Google > > Groups "Clojure" group. > > To post to this group, send email to clojure@googlegroups.com > > Note that posts from new members are moderated - please be patient with > > your first post. > > To unsubscribe from this group, send email to > > clojure+unsubscr...@googlegroups.com<clojure%2bunsubscr...@googlegroups.com > > > > > For more options, visit this group at > >http://groups.google.com/group/clojure?hl=en -- You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en