On Jan 4, 2008 2:55 PM, <[EMAIL PROTECTED]> wrote: > Hello, > This is a question for the best method (in terms of performance > only) to choose a random element from a list among those that satisfy > a certain property.
I would automatically use random.choice(filter(pred_func, a_list)). You just have to catch the possible IndexError. This is the setting: I need to pick from a list a random element > that satisfies a given property. All or none of the elements may have > the property. Most of the time, many of the elements will satisfy the > property, and the property is a bit expensive to evaluate. Chance of > having the property are uniform among elements. > > A simple approach is: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > > Returns None if no element has the property > ''' > random.shuffle(a_list) > for i in a_list: > if property(i): return i I'm pretty sure you don't want to use a destructive random_pick function. You'll have to shuffle a copy instead to avoid that problem. > > but that requires to shuffle the list every time. > > A second approach, that works if we know that at least one element of > the list has the property, is: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > > Loops forever if no element has the property > ''' > while 1: > i=random.choice(a_list) > if property(i): return i > > which is more efficient (on average) if many elements of the list have > the property and less efficient if only few elements of the list has > the property (and goes crazy if no element has the property) That's awful. Here's another linear time idea, returning the nearest element that satisfies the predicate. offset = random.randrange(len(a_list)) for n in xrange(len(a_list)): ix = (offset + n) % len(a_list) if predicate(a_list[ix]): return a_list[ix] raise ValueError('no element has the property') The possible problem is that large strings of elements in a row that don't match the predicate greatly increase the odds of getting the following element that *does* match the predicate. Worst case is two predicate matched elements in a row, surrounded by a bunch of non-matched elements. > Yet another one: > > import random > def random_pick(a_list,property): > '''Returns a random element from a list that has the property > ''' > b_list=[x for x in a_list if property(x)] > try: > return random.choice(b_list) > finally: return None > > but this one checks the property on all the elements, which is no > good. Is it really expensive to check the property? That would mitigate against the filter solution and for the other one I posted. This seems to be a case of trying to solve a data problem functionally. It'd be better to store your data differently if this will be a frequent operation and you simply can't afford to call the predicate on all the elements. Incidentally, try not to shadow builtin names like 'property'. -- Neil Cerutti
-- http://mail.python.org/mailman/listinfo/python-list