fastest method to choose a random element
Hello, This is a question for the best method (in terms of performance only) to choose a random element from a list among those that satisfy a certain property. This is the setting: I need to pick from a list a random element that satisfies a given property. All or none of the elements may have the property. Most of the time, many of the elements will satisfy the property, and the property is a bit expensive to evaluate. Chance of having the property are uniform among elements. A simple approach is: import random def random_pick(a_list,property): '''Returns a random element from a list that has the property Returns None if no element has the property ''' random.shuffle(a_list) for i in a_list: if property(i): return i but that requires to shuffle the list every time. A second approach, that works if we know that at least one element of the list has the property, is: import random def random_pick(a_list,property): '''Returns a random element from a list that has the property Loops forever if no element has the property ''' while 1: i=random.choice(a_list) if property(i): return i which is more efficient (on average) if many elements of the list have the property and less efficient if only few elements of the list has the property (and goes crazy if no element has the property) Yet another one: import random def random_pick(a_list,property): '''Returns a random element from a list that has the property ''' b_list=[x for x in a_list if property(x)] try: return random.choice(b_list) finally: return None but this one checks the property on all the elements, which is no good. I don't need strong random numbers, so a simple solution like: import random globalRNG=random.Random() def random_pick(a_list,property): '''Returns a random element from a list that has the property Works only if len(a_list)+1 is prime: uses Fermat's little theorem ''' a=globalRNG(1,len(a_list)) ind=a for i in xrange(len(a_list)): x=a_list[a-1] if property(x):return x ind*=a but this works only if len(a_list)+1 is prime!!! Now this one could be saved if there were an efficient method to find a prime number given than a number n but not very much greater... Any other ideas? Thanks everybody -- http://mail.python.org/mailman/listinfo/python-list
Re: fastest method to choose a random element
> Caching might help. > > If random_pick is called several times with the same list(s) then > cache the result of > [property(i) for i in a_list] against a_list. > > If random_pick is called several times with list(s) with multiple > instances of list items then cache > property(i) against i for i in a_list . > > You could do both. > > You might investigate wether property(i) could be implemented using a > faster algorithm, maybe table lookup, or interpolation from initial > table lookup. > > - Paddy. Thanks, Paddy. Those are interesting general comments for the general problem. By the way, I noticed two things: I forgot to write randint in the third approach: > > a=globalRNG.randint(1,len(a_list)) The warning "The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet." means a person, but not a bot, may see my email address, so it is safe to use my real address next time... -- http://mail.python.org/mailman/listinfo/python-list
Re: fastest method to choose a random element
On Jan 5, 5:07 pm, [EMAIL PROTECTED] wrote: > Hello, Paul and Arnaud. > While I think about your answers: do you think there is any way to > avoid shuffle? > It may take unnecessary long on a long list most of whose elements > have the property. Umm... You provide nice answers in the case many elements are picked from the same list. Any ideas for the case when the picker is called many times on a program, but never twice with the same list? -- http://mail.python.org/mailman/listinfo/python-list
Re: fastest method to choose a random element
Hello, Paul and Arnaud. While I think about your answers: do you think there is any way to avoid shuffle? It may take unnecessary long on a long list most of whose elements have the property. -- http://mail.python.org/mailman/listinfo/python-list
Re: fastest method to choose a random element
On Jan 5, 9:50 pm, Paul Hankin <[EMAIL PROTECTED]> wrote: > On Jan 5, 5:12 pm, Paul Hankin <[EMAIL PROTECTED]> wrote: > > > > > On Jan 5, 4:14 pm, [EMAIL PROTECTED] wrote: > > > > On Jan 5, 5:07 pm, [EMAIL PROTECTED] wrote: > > > > > Hello, Paul and Arnaud. > > > > While I think about your answers: do you think there is any way to > > > > avoid shuffle? > > > > It may take unnecessary long on a long list most of whose elements > > > > have the property. > > > You could generate a random shuffle of range(len(seq)) lazily, and use > > that to iterate over your sequence. > > > import random > > import itertools > > > def randxrange(n): > > "Generate the numbers 0 to n-1 in a random order" > > x = range(n) > > for i in xrange(n): > > r = random.randrange(n - i) > > yield x[r] > > x[r] = x[n - i - 1] > > > def shuffled(seq): > > "Generate the elements of seq in a random order" > > return (seq[i] for i in randxrange(len(seq))) > > > def pick_random(seq, prop): > > return itertools.ifilter(prop, shuffled(seq)).next() > > At the risk of filling this thread with my posts, here's a much > simplified and faster version of this code that uses no extra storage. > > import random > > def pick_random(seq, prop): > L = len(seq) > for i in xrange(L): > r = random.randrange(L - i) > if prop(seq[r]): > return seq[r] > seq[r] = seq[L - i - 1] > > -- > Paul Hankin This one is good. Someone commented that you destroy the list, but that can be fixed: def pick_random(seq, prop): L = len(seq) for i in xrange(L): r = random.randrange(L - i) if prop(seq[r]): return seq[r] seq[r], seq[L - i - 1]= seq[L - i - 1],seq[r] just pushing elements without the property to the end of the list (list is mutated, true, but shuffle will do that too). In each iteration of the for loop, there is only one random element, one check of the property, and rebinding of elements without altering the lenght of the list. This is optimal and has all the functionality. Two more comments: for buzcor: deleting an element in the middle of a list is costly for martin: that is certainly enough for me Thanks everybody! -- http://mail.python.org/mailman/listinfo/python-list
Re: Request for help with Image color space conversion
> > ...where the image data is loaded into a numpy array > (1600x1200x3)... One comment: that is a big array, too big for the cache memory. I know that in these cases it makes a difference how many times the slices of the array are loaded and unloaded from RAM onto cache. One issue is that a 2D array l[i,j] is actually a 1D array: either l[i,j]=l[i*N+j] or l[i,j]=l[j*N+i] I don't know which in python/numpy. In the first case, when you fix i and change j, you get consecutive positions in the underlying 1D array, and they all belong to the same slice, which is loaded onto cache very fast as a block. If you do the opposite, you are making jumps to distant positions in the array (not a slice), and performance suffers. In gimp, for example, the underlying array is split into 'tiles', which are 64x64 pixel regions that can be loaded/unloaded as a block. And that's about all I know on the issue. So it ma -- http://mail.python.org/mailman/listinfo/python-list
Re: fastest method to choose a random element
> Just for fun, I profiled my answer versus the final answer... This mailing list is awesome! PS:ajaksu, I have to leave now, I hope bukzor's answer was enough to you (at least for the moment) -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get all the variables in a python shell
Your project interests me. Actually I was thinking about doing the same. I hadn't worked on it at all, but I though about it and had the idea about reading the session namespace directly, which I though would be stored in the __dict__ attribute of something. After reading your post, I have been trying a little bit, and I have found a way to do it with ipython. If you open an ipython console, press _ then hit TAB, you'll see it stores some useful information, including all input, all output, and after some searching, a dictionary matching all variables to its values. __IPYTHON__.user_ns There is a little extra stuff in there that you don't want, but that can be easily filtered (the extra stuff is either 'In', 'Out', 'help' or starts with '_'). I've tried it, and you can change the value in that dict to alter the value of the real variable. Say you have a variable 'test': test=5 __IPYTHON__.user_ns['test']=4 print test #prints 5 If I get it right, python is a dynamic language, and you won't break things by messing around with its inner stuff like this, but you better check it. Is this what you had in mind? -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get all the variables in a python shell
I meant it prints 4, which means the value of test is modified by the access to the dict > test=5 > __IPYTHON__.user_ns['test']=4 > print test #prints 4 > -- http://mail.python.org/mailman/listinfo/python-list
Re: How to get all the variables in a python shell
Have you seen this page? http://matplotlib.sourceforge.net/screenshots.html On watching this, I wouldn't say matplotlib is inferior to matlab plotting. Also, I don't know what they use in sage, but they have 3D plots of surfaces that you can rotate with the mouse. Do as you like, but if you want to "intergrate as many exsiting computation libraries as possible" you may end up doing something too similar to sage. I wouldn't want to go on such a trip alone, so even if sage is not exactly what I would do, I will probably work with them. Their client-server approach should make it easy to work on a cool interface without messing too much with their code. It's true, you'll have to carry with you a lot of symbolic computation tools that may be you don't need as an engineer, but is it that important? The client-server approach has other advantages: if you have a very lightweight computer (like EEE), you can place the server at home and the lightweight computer is enough to have a full scientific environment outdoors. And yes, I'm pretty sure you can call any library from within sage the same way you'd do it from python. Regards Pablo -- http://mail.python.org/mailman/listinfo/python-list