On Nov 16, 6:58 am, duncan smith <[EMAIL PROTECTED]> wrote: > Bruza wrote: > > I need to implement a "random selection" algorithm which takes a list > > of [(obj, prob),...] as input. Each of the (obj, prob) represents how > > likely an object, "obj", should be selected based on its probability > > of > > "prob".To simplify the problem, assuming "prob" are integers, and the > > sum of all "prob" equals 100. For example, > > > items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)] > > > The algorithm will take a number "N", and a [(obj, prob),...] list as > > inputs, and randomly pick "N" objects based on the probabilities of > > the > > objects in the list. > > > For N=1 this is pretty simply; the following code is sufficient to do > > the job. > > > def foo(items): > > index = random.randint(0, 99) > > currentP = 0 > > for (obj, p) in items: > > currentP += w > > if currentP > index: > > return obj > > > But how about the general case, for N > 1 and N < len(items)? Is there > > some clever algorithm using Python standard "random" package to do > > the trick? > > I think you need to clarify what you want to do. The "probs" are > clearly not probabilities. Are they counts of items? Are you then > sampling without replacement? When you say N < len(items) do you mean N > <= sum of the "probs"? > > Duncabn
I think I need to explain on the probability part: the "prob" is a relative likelihood that the object will be included in the output list. So, in my example input of items = [('Mary',30), ('John', 10), ('Tom', 45), ('Jane', 15)] So, for any size of N, 'Tom' (with prob of 45) will be more likely to be included in the output list of N distinct member than 'Mary' (prob of 30) and much more likely than that of 'John' (with prob of 10). I know "prob" is not exactly the "probability" in the context of returning a multiple member list. But what I want is a way to "favor" some member in a selection process. So far, only Boris's solution is closest (but not quite) to what I need, which returns a list of N distinct object from the input "items". However, I tried with input of items = [('Mary',1), ('John', 1), ('Tom', 1), ('Jane', 97)] and have a repeated calling of Ben -- http://mail.python.org/mailman/listinfo/python-list