fastest method to choose a random element

2008-01-04 Thread caca
  Hello,
  This is a question for the best method (in terms of performance
only) to choose a random element from a list among those that satisfy
a certain property.

  This is the setting: I need to pick from a list a random element
that satisfies a given property. All or none of the elements may have
the property. Most of the time, many of the elements will satisfy the
property, and the property is a bit expensive to evaluate. Chance of
having the property are uniform among elements.

  A simple approach is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Returns None if no element  has the property
'''
random.shuffle(a_list)
for i in a_list:
if property(i): return i

but that requires to shuffle the list every time.

 A second approach, that works if we know that at least one element of
the list has the property, is:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Loops forever if no element has the property
'''
while 1:
i=random.choice(a_list)
if property(i): return i

which is more efficient (on average) if many elements of the list have
the property and less efficient if only few elements of the list has
the property (and goes crazy if no element has the property)

Yet another one:

import random
def random_pick(a_list,property):
'''Returns a random element from a list that has the property
'''
b_list=[x for  x in a_list if property(x)]
try:
return random.choice(b_list)
finally: return None

but this one checks the property on all the elements, which is no
good.

I don't need strong random numbers, so a simple solution like:
import random
globalRNG=random.Random()

def random_pick(a_list,property):
'''Returns a random element from a list that has the property

Works only if len(a_list)+1 is prime: uses Fermat's little theorem
'''
a=globalRNG(1,len(a_list))
ind=a
for i in xrange(len(a_list)):
x=a_list[a-1]
if property(x):return x
ind*=a

but this works only if len(a_list)+1 is prime!!! Now this one could be
saved if there were an efficient method to find a prime number given
than a number n but not very much greater...

Any other ideas? Thanks everybody
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: fastest method to choose a random element

2008-01-05 Thread caca

> Caching might help.
>
> If random_pick is called several times with the same list(s) then
> cache the result of
>  [property(i) for i in a_list] against a_list.
>
> If random_pick is called several times with list(s) with multiple
> instances of list items then cache
>  property(i) against i for i in a_list .
>
> You could do both.
>
> You might investigate wether property(i) could be implemented using a
> faster algorithm, maybe table lookup, or interpolation from initial
> table lookup.
>
> - Paddy.

Thanks, Paddy. Those are interesting general comments for the general
problem.
By the way, I noticed two things:

I forgot to write randint in the third approach:

> > a=globalRNG.randint(1,len(a_list))

The warning "The group you are posting to is a Usenet group. Messages
posted to this group will make your email address visible to anyone on
the Internet." means a person, but not a bot, may see my email
address, so it is safe to use my real address next time...

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: fastest method to choose a random element

2008-01-05 Thread caca
On Jan 5, 5:07 pm, [EMAIL PROTECTED] wrote:
> Hello, Paul and Arnaud.
> While I think about your answers: do you think there is any way to
> avoid shuffle?
> It may take unnecessary long on a long list most of whose elements
> have the property.

Umm...
You provide nice answers in the case many elements are picked from the
same list.
Any ideas for the case when the picker is called many times on a
program, but never twice with the same list?
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: fastest method to choose a random element

2008-01-05 Thread caca
Hello, Paul and Arnaud.
While I think about your answers: do you think there is any way to
avoid shuffle?
It may take unnecessary long on a long list most of whose elements
have the property.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: fastest method to choose a random element

2008-01-05 Thread caca
On Jan 5, 9:50 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
> On Jan 5, 5:12 pm, Paul Hankin <[EMAIL PROTECTED]> wrote:
>
>
>
> > On Jan 5, 4:14 pm, [EMAIL PROTECTED] wrote:
>
> > > On Jan 5, 5:07 pm, [EMAIL PROTECTED] wrote:
>
> > > > Hello, Paul and Arnaud.
> > > > While I think about your answers: do you think there is any way to
> > > > avoid shuffle?
> > > > It may take unnecessary long on a long list most of whose elements
> > > > have the property.
>
> > You could generate a random shuffle of range(len(seq)) lazily, and use
> > that to iterate over your sequence.
>
> > import random
> > import itertools
>
> > def randxrange(n):
> > "Generate the numbers 0 to n-1 in a random order"
> > x = range(n)
> > for i in xrange(n):
> > r = random.randrange(n - i)
> > yield x[r]
> > x[r] = x[n - i - 1]
>
> > def shuffled(seq):
> > "Generate the elements of seq in a random order"
> > return (seq[i] for i in randxrange(len(seq)))
>
> > def pick_random(seq, prop):
> > return itertools.ifilter(prop, shuffled(seq)).next()
>
> At the risk of filling this thread with my posts, here's a much
> simplified and faster version of this code that uses no extra storage.
>
> import random
>
> def pick_random(seq, prop):
> L = len(seq)
> for i in xrange(L):
> r = random.randrange(L - i)
> if prop(seq[r]):
> return seq[r]
> seq[r] = seq[L - i - 1]
>
> --
> Paul Hankin

This one is good. Someone commented that you destroy the list, but
that can be fixed:

 def pick_random(seq, prop):
 L = len(seq)
 for i in xrange(L):
 r = random.randrange(L - i)
 if prop(seq[r]):
 return seq[r]
 seq[r], seq[L - i - 1]= seq[L - i - 1],seq[r]

just pushing elements without the property to the end of the list
(list is mutated, true, but shuffle will do that too). In each
iteration of the for loop, there is only one random element, one check
of the property, and rebinding of elements without altering the lenght
of the list. This is optimal and has all the functionality.

Two more comments:
for buzcor: deleting an element in the middle of a list is costly
for martin: that is certainly enough for me

Thanks everybody!
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Request for help with Image color space conversion

2008-01-06 Thread caca
>
> ...where the image data is loaded into a numpy array
> (1600x1200x3)...


One comment: that is a big array, too big for the cache memory. I know
that in these cases it makes a difference how many times the slices of
the array are loaded and unloaded from RAM onto cache. One issue is
that a 2D array l[i,j] is actually a 1D array:
either
l[i,j]=l[i*N+j]
or
l[i,j]=l[j*N+i]
I don't know which in python/numpy. In the first case, when you fix i
and change j, you get consecutive positions  in the underlying 1D
array, and they all belong to the same slice, which is loaded onto
cache very fast as a block. If you do the opposite, you are making
jumps to distant positions in the array (not a slice), and performance
suffers.

In gimp, for example, the underlying array is split into 'tiles',
which are 64x64 pixel regions that can be loaded/unloaded as a block.
And that's about all I know on the issue.

So it ma






-- 
http://mail.python.org/mailman/listinfo/python-list


Re: fastest method to choose a random element

2008-01-07 Thread caca

> Just for fun, I profiled my answer versus the final answer...

This mailing list is awesome!


PS:ajaksu, I have to leave now, I hope bukzor's answer was enough to
you (at least for the moment)
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How to get all the variables in a python shell

2008-05-30 Thread caca
  Your project interests me. Actually I was thinking about doing the
same. I hadn't worked on it at all, but I though about it and had the
idea about reading the session namespace directly, which I though
would be stored in the __dict__ attribute of something.

  After reading your post, I have been trying a little bit, and I have
found a way to do it with ipython. If you open an ipython console,
press _ then hit TAB, you'll see it stores some useful information,
including all input, all output, and after some searching, a
dictionary matching all variables to its values.

__IPYTHON__.user_ns

  There is a little extra stuff in there that you don't want, but that
can be easily filtered (the extra stuff is either 'In', 'Out', 'help'
or starts with '_'). I've tried it, and you can change the value in
that dict to alter the value of the real variable. Say you have a
variable 'test':

test=5
__IPYTHON__.user_ns['test']=4
print test #prints 5

  If I get it right, python is a dynamic language, and you won't break
things by messing around with its inner stuff like this, but you
better check it.

  Is this what you had in mind?
--
http://mail.python.org/mailman/listinfo/python-list


Re: How to get all the variables in a python shell

2008-05-31 Thread caca
I meant it prints 4, which means the value of test is modified by the
access to the dict

> test=5
> __IPYTHON__.user_ns['test']=4
> print test #prints 4
>

--
http://mail.python.org/mailman/listinfo/python-list


Re: How to get all the variables in a python shell

2008-05-31 Thread caca

  Have you seen this page?
http://matplotlib.sourceforge.net/screenshots.html
  On watching this, I wouldn't say matplotlib is inferior to matlab
plotting. Also, I don't know what they use in sage, but they have 3D
plots of surfaces that you can rotate with the mouse.
  Do as you like, but if you want to "intergrate as many exsiting
computation libraries as possible" you may end up doing something too
similar to sage. I wouldn't want to go on such a trip alone, so even
if sage is not exactly what I would do, I will probably work with
them. Their client-server approach should make it easy to work on a
cool interface without messing too much with their code. It's true,
you'll have to carry with you a lot of symbolic computation tools that
may be you don't need as an engineer, but is it that important? The
client-server approach has other advantages: if you have a very
lightweight computer (like EEE), you can place the server at home and
the lightweight computer is enough to have a full scientific
environment outdoors. And yes, I'm pretty sure you can call any
library from within sage the same way you'd do it from python.
 Regards
Pablo
--
http://mail.python.org/mailman/listinfo/python-list