Maybe it would help to make your problem statement a litte rigorous so we can get a clearer idea of whats required.
One possible formulation: Given a list L of pairs of values, weightings: [ (v_0, w_0), (v_1, w_1), ....], and some N between 1 and length(L) you would like to randomly select a set of N (distinct) values, V, such that for any ints i and j, Prob (v_i is in V) / Prob (v_j is in V) = w_i / w_j This matches your expectations for N = 1. Intuitively though, without having put much thought into it, I suspect this might not be possible in the general case. You might then want to (substantially) relax thec ondition to Prob (v_i is in V) >= Prob (v_j is in V) iff w_i >= w_j but in that case its more an ordering of likelihoods rather than a weighting, and doesn't guarantee the right behaviour for N = 1, so i don't think thats really what you want. I can't think of any other obvious way of generalising the behaviour of the N = 1 case. - Jordan On Nov 17, 10:50 am, Bruza <[EMAIL PROTECTED]> wrote: > On Nov 16, 4:47 pm, Bruza <[EMAIL PROTECTED]> wrote: > > > > > > > On Nov 16, 6:58 am, duncan smith <[EMAIL PROTECTED]> > > wrote: > > > > Bruza wrote: > > > > I need to implement a "random selection" algorithm which takes a list > > > > of [(obj, prob),...] as input. Each of the (obj, prob) represents how > > > > likely an object, "obj", should be selected based on its probability > > > > of > > > > "prob".To simplify the problem, assuming "prob" are integers, and the > > > > sum of all "prob" equals 100. For example, > > > > > items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)] > > > > > The algorithm will take a number "N", and a [(obj, prob),...] list as > > > > inputs, and randomly pick "N" objects based on the probabilities of > > > > the > > > > objects in the list. > > > > > For N=1 this is pretty simply; the following code is sufficient to do > > > > the job. > > > > > def foo(items): > > > > index = random.randint(0, 99) > > > > currentP = 0 > > > > for (obj, p) in items: > > > > currentP += w > > > > if currentP > index: > > > > return obj > > > > > But how about the general case, for N > 1 and N < len(items)? Is there > > > > some clever algorithm using Python standard "random" package to do > > > > the trick? > > > > I think you need to clarify what you want to do. The "probs" are > > > clearly not probabilities. Are they counts of items? Are you then > > > sampling without replacement? When you say N < len(items) do you mean N > > > <= sum of the "probs"? > > > > Duncabn > > > I think I need to explain on the probability part: the "prob" is a > > relative likelihood that the object will be included in the output > > list. So, in my example input of > > > items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)] > > > So, for any size of N, 'Tom' (with prob of 45) will be more likely to > > be included in the output list of N distinct member than 'Mary' (prob > > of 30) and much more likely than that of 'John' (with prob of 10). > > > I know "prob" is not exactly the "probability" in the context of > > returning a multiple member list. But what I want is a way to "favor" > > some member in a selection process. > > > So far, only Boris's solution is closest (but not quite) to what I > > need, which returns a list of N distinct object from the input > > "items". However, I tried with input of > > > items = [('Mary',1), ('John', 1), ('Tom', 1), ('Jane', 97)] > > > and have a repeated calling of > > > Ben > > OOPS. I pressed the Send too fast. > > The problem w/ Boris's solution is that after repeated calling of > randomPick(3,items), 'Jane' is not the most "frequent appearing" > member in all the out list of 3 member lists...- Hide quoted text - > > - Show quoted text - -- http://mail.python.org/mailman/listinfo/python-list