On Jan 5, 5:36 pm, [EMAIL PROTECTED] wrote: > On Jan 5, 9:50 pm, Paul Hankin <[EMAIL PROTECTED]> wrote: > > > > > On Jan 5, 5:12 pm, Paul Hankin <[EMAIL PROTECTED]> wrote: > > > > On Jan 5, 4:14 pm, [EMAIL PROTECTED] wrote: > > > > > On Jan 5, 5:07 pm, [EMAIL PROTECTED] wrote: > > > > > > Hello, Paul and Arnaud. > > > > > While I think about your answers: do you think there is any way to > > > > > avoid shuffle? > > > > > It may take unnecessary long on a long list most of whose elements > > > > > have the property. > > > > You could generate a random shuffle of range(len(seq)) lazily, and use > > > that to iterate over your sequence. > > > > import random > > > import itertools > > > > def randxrange(n): > > > "Generate the numbers 0 to n-1 in a random order" > > > x = range(n) > > > for i in xrange(n): > > > r = random.randrange(n - i) > > > yield x[r] > > > x[r] = x[n - i - 1] > > > > def shuffled(seq): > > > "Generate the elements of seq in a random order" > > > return (seq[i] for i in randxrange(len(seq))) > > > > def pick_random(seq, prop): > > > return itertools.ifilter(prop, shuffled(seq)).next() > > > At the risk of filling this thread with my posts, here's a much > > simplified and faster version of this code that uses no extra storage. > > > import random > > > def pick_random(seq, prop): > > L = len(seq) > > for i in xrange(L): > > r = random.randrange(L - i) > > if prop(seq[r]): > > return seq[r] > > seq[r] = seq[L - i - 1] > > > -- > > Paul Hankin > > This one is good. Someone commented that you destroy the list, but > that can be fixed: > > def pick_random(seq, prop): > L = len(seq) > for i in xrange(L): > r = random.randrange(L - i) > if prop(seq[r]): > return seq[r] > seq[r], seq[L - i - 1]= seq[L - i - 1],seq[r] > > just pushing elements without the property to the end of the list > (list is mutated, true, but shuffle will do that too). In each > iteration of the for loop, there is only one random element, one check > of the property, and rebinding of elements without altering the lenght > of the list. This is optimal and has all the functionality. > > Two more comments: > for buzcor: deleting an element in the middle of a list is costly > for martin: that is certainly enough for me > > Thanks everybody!
Just for fun, I profiled my answer versus the final answer over 300 seconds. I wrapped some parts of the functions in trivial functions so they would show up in the profiling. I found that my answer was 50% slower, but not because of the delete, but mostly the len() inside the loop, and the bool(seq) at the start of each loop. I was able to rewrite mine to only be 14 seconds slower (the time to make the copy at the beginning), but don't believe that's useful enough to post. ncalls tottime percall cumtime percall filename:lineno(function) 1516221 48.545 0.000 158.850 0.000 picked.py: 14(pick_random_bukzor) 3066421 19.359 0.000 53.815 0.000 picked.py:8(randrange2) 3066421 16.852 0.000 24.751 0.000 picked.py:9(len2) 1516221 14.712 0.000 14.712 0.000 picked.py:12(make_copy) 3066421 9.767 0.000 9.767 0.000 picked.py:10(notempty) 1550200 7.260 0.000 7.260 0.000 picked.py:7(delete) 1516221 31.574 0.000 104.384 0.000 picked.py: 27(pick_random) 3063737 19.556 0.000 54.617 0.000 picked.py:25(randrange3) 1516221 8.485 0.000 12.582 0.000 picked.py:26(len3) 1547516 5.611 0.000 5.611 0.000 picked.py: 24(do_transform) def delete(list, index): del list[index] def randrange2(X): return randrange(X) def len2(seq): return len(seq) def notempty(seq): if seq: return True def make_copy(l): return list(l) def pick_random_bukzor(seq, prop=bool): temp = make_copy(seq) while notempty(seq): i = randrange2(len2(temp)) if prop(temp[i]): return temp[i] else: delete(temp, i) def do_transform(seq, L, r, i): seq[r], seq[L - i - 1]= seq[L - i - 1],seq[r] def randrange3(X): return randrange(X) def len3(seq): return len(seq) def pick_random(seq, prop=bool): L = len3(seq) for i in xrange(L): r = randrange3(L - i) if prop(seq[r]): return seq[r] do_transform(seq, L, r, i) -- http://mail.python.org/mailman/listinfo/python-list