Re: multiprocessing problems

2010-01-20 Thread Nils Ruettershoff

Hi Doxa,

DoxaLogos wrote:
[...]

I found out my problems. One thing I did was follow the test queue
example in the documentation, but the biggest problem turned out to be
a pool instantiated globally in my script, which was causing most of
the endless process spawning, even with the "if __name__ ==
"__main__":" block.


Problems that solve themselves are the best problems ;)

One tip: your current algorithm has some overhead, because for every
batch you start four additional Python interpreters, let them process
the files, close all the newly spawned interpreters, and then start
four interpreters again for the next batch of files.


For this kind of job I prefer to start the processes once and feed them
with data via a queue. That reduces the overhead and increases runtime
performance.



It could look like this (not directly executable due to some pseudo
functions -> untested):

import multiprocessing
import Queue

class Worker(multiprocessing.Process):
    def __init__(self, feeder_q, queue_filled):
        multiprocessing.Process.__init__(self)
        self.feeder_q = feeder_q
        self.queue_filled = queue_filled

    def run(self):
        serve = True
        # start infinite loop
        while serve:
            try:
                # scan the queue for work; blocks the process for up to
                # 5 seconds. If no item is in the queue by then, a
                # Queue.Empty will be raised
                text = self.feeder_q.get(True, timeout=5)
                if text:
                    do_stuff(text)
                # very important! tell the queue that the fetched work
                # has been finished, otherwise feeder_q.join() would
                # block forever
                self.feeder_q.task_done()
            except Queue.Empty:
                # as soon as the queue is empty and all work has been
                # enqueued, the process can terminate itself
                if self.queue_filled.is_set() and self.feeder_q.empty():
                    serve = False
        return


if __name__ == '__main__':
    number_of_processes = 4
    queue_filled = multiprocessing.Event()
    feeder_q = multiprocessing.JoinableQueue()
    process_list = []
    # get the file names which need to be processed
    all_files = get_all_files()
    # start the processes
    for i in xrange(number_of_processes):
        process = Worker(feeder_q, queue_filled)
        process.start()
        process_list.append(process)
    # start feeding
    for file in all_files:
        feeder_q.put(file)
    # inform the processes that all work has been enqueued
    queue_filled.set()
    # wait until the queue is empty
    feeder_q.join()
    # wait until all processes have finished their jobs
    for process in process_list:
        process.join()



Cheers,
Nils
--
http://mail.python.org/mailman/listinfo/python-list


Re: multiprocessing problems

2010-01-20 Thread Nils Ruettershoff

Hi,

Adam Tauno Williams wrote:

[...]

Here's the guts of my latest incarnation.
def ProcessBatch(files):
    p = []
    for file in files:
        p.append(Process(target=ProcessFile,args=file))
    for x in p:
        x.start()
    for x in p:
        x.join()
    p = []
    return
Now, the function calling ProcessBatch looks like this:
def ReplaceIt(files):
    processFiles = []
    for replacefile in files:
        if(CheckSkipFile(replacefile)):
            processFiles.append(replacefile)
        if(len(processFiles) == 4):
            ProcessBatch(processFiles)
            processFiles = []
    # check for left over files once main loop is done and process them
    if(len(processFiles) > 0):
        ProcessBatch(processFiles)



According to this you will process files in sets of four, but an
unknown number of sets of four.

  
This is not correct, because of the x.join(): it blocks the parent
process until all processes in the current set have terminated. So as
soon as the current set of processes has finished its job, a new set
will be spawned.
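
By the way, if you don't need the manual batching at all,
multiprocessing.Pool hands the files out one by one, so a single slow
file no longer holds back the other three in its batch. A minimal
sketch (untested; ProcessFile and the file list stand in for the real
ones):

import multiprocessing

def ProcessFile(filename):
    # placeholder for the real per-file work
    print filename

if __name__ == '__main__':
    all_files = ['a.txt', 'b.txt', 'c.txt']  # stand-in for the real list
    pool = multiprocessing.Pool(processes=4)  # start four workers once
    pool.map(ProcessFile, all_files)          # feed them file by file
    pool.close()
    pool.join()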


Cheers,
Nils
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: source install of python2.7 and rpm install of cx_Oracle collision

2010-07-22 Thread Nils Ruettershoff

Hi Jim,

Jim Qiu wrote:

[...]
I found out that only libpython2.7.a is generated when I install
python2.7; who can tell me what I need to do? I want a
libpython2.7.so.1.0 generated when


I didn't read your complete mail... In addition to the steps I've
described in my other mail, you need to tell the "configure" script
that you would like to have shared libraries.


So you need to add --enable-shared to your configure call:

./configure --prefix=/opt/Python2.7a --enable-shared

Now you get the shared libraries in the lib folder.
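
You can verify it afterwards by listing the lib folder (assuming the
prefix above):

ls -l /opt/Python2.7a/lib/libpython2.7.so*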

Cheers,
Nils
--
http://mail.python.org/mailman/listinfo/python-list


Re: source install of python2.7 and rpm install of cx_Oracle collision

2010-07-22 Thread Nils Ruettershoff

Hi Jim,

Jim Qiu wrote:


I installed Python 2.7 from source, and also installed the RPM
version of cx_Oracle for Python 2.7.


 But  ldd tells me :
#ldd cx_Oracle.so
  libpython2.7.so.1.0 => not found

I found out that only libpython2.7.a is generated when I install
python2.7; who can tell me what I need to do? I want a
libpython2.7.so.1.0 generated when




[...]

Because you compiled Python from source, the Python library is not in
the default library search path, so the lib can't be found.


As a quick workaround you can extend the LD_LIBRARY_PATH variable with
the path and check whether cx_Oracle now finds the lib. Let's assume
you "installed" Python at /opt/Python2.7a; then you need to extend the
variable in this way:


export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/Python2.7a/lib"

Afterwards run ldd again and check whether the lib is found now. If
not, you've got the wrong path.
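
For example (assuming cx_Oracle.so sits in the current directory):

ldd cx_Oracle.so | grep libpython

If the line now shows a path instead of "not found", the workaround
worked.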


Now you need to persist the changed library path.

For this, you need to be root:

1. echo "/opt/Python2.7a/lib" > /etc/ld.so.conf.d/Python2.7a.conf
2. refresh your shared library cache with "ldconfig"

Now your OS knows the new library location.

But this is not clean. As Daniel mentioned, you should try to get an
RPM. Otherwise you may get into trouble if you install a newer
Python 2.7 version and forget to maintain your library paths.


Cheers,
Nils
--
http://mail.python.org/mailman/listinfo/python-list


Re: Seeding the rand() Generator

2009-08-06 Thread Nils Ruettershoff

Hi Fred,

I just saw your SQL Statement

An example would be: SELECT first, second, third, fourth,
fifth, sixth from sometable order by rand() limit 1

  
and I feel obliged to give you a piece of advice: don't use this SQL
statement to pick a random row; your users and maybe your DBA will much
appreciate it.
You are certainly asking why. Let's have a brief look at what you are
asking your MySQL DB to do:


Fetch all rows from 'sometable' (but only the attributes 'first,
second, ...'), sort them all in a random order, and afterwards throw
everything away but the first row.


If you have a table with 10 rows you fetch and sort all 10 rows,
pick one row and discard the other 9. That doesn't sound very clever,
right?


So please take a look at this site for a better alternative approach:


http://jan.kneschke.de/projects/mysql/order-by-rand/

if you want to know more please check this article too:

http://jan.kneschke.de/2007/2/22/analyzing-complex-queries
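
Just to sketch one alternative from the Python side (untested; the
connection parameters are placeholders, and this assumes the MySQLdb
driver): ask the server for the row count once, then fetch a single
row at a random offset instead of sorting the whole table.

import random
import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="secret", db="mydb")  # placeholder credentials
cur = conn.cursor()

# one cheap query for the row count ...
cur.execute("SELECT COUNT(*) FROM sometable")
count = cur.fetchone()[0]

# ... then fetch exactly one row at a random offset; no sorting involved
offset = random.randrange(count)
cur.execute("SELECT first, second, third, fourth, fifth, sixth "
            "FROM sometable LIMIT %s, 1", (offset,))
row = cur.fetchone()

LIMIT still has to skip 'offset' rows, but the server no longer fetches
and sorts the whole table; the first article above shows even faster
variants for tables with a dense numeric id.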

regards,

Nils

--
http://mail.python.org/mailman/listinfo/python-list


Re: Standardizing RPython - it's time.

2010-10-12 Thread Nils Ruettershoff

Hi,

On 10/12/2010 07:41 AM, John Nagle wrote:

[...]
With Unladen Swallow looking like a failed IT project, a year
behind schedule and not delivering anything like the promised
performance, Google management may pull the plug on funding.
Since there hasn't been a "quarterly release" in a year, that
may already have effectively happened.


are you sure that the project has failed? Yes, the project page is 
outdated, but there is also PEP 3146 [1], "Merging Unladen Swallow into 
CPython".


According to the accepted PEP, the merge is planned for Python 3.3. So 
to me it looks like the development itself has finished and the merging 
has started...


Does anyone have more details?

Regards,
Nils


[1] http://www.python.org/dev/peps/pep-3146/
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Standardizing RPython - it's time.

2010-10-12 Thread Nils Ruettershoff

On 10/12/2010 05:18 PM, Nils Ruettershoff wrote:

Hi,

On 10/12/2010 07:41 AM, John Nagle wrote:

[...]
With Unladen Swallow looking like a failed IT project, a year
behind schedule and not delivering anything like the promised
performance, Google management may pull the plug on funding.
Since there hasn't been a "quarterly release" in a year, that
may already have effectively happened.


are you sure that the project has failed? Yes, the project page is 
outdated, but there is also PEP 3146 [1], "Merging Unladen Swallow into 
CPython".


According to the accepted PEP, the merge is planned for Python 3.3. So 
to me it looks like the development itself has finished and the merging 
has started...




I've just found a statement from Collin Winter [1]. He says that he is 
working on other Google projects, but they are still targeting the 
merge of Unladen Swallow into CPython 3.3. The last commit to the 
branch (py3-jit) was in June this year.


So I would say the project is neither dead nor failed, just delayed.

Cheers,
Nils

[1] 
http://groups.google.com/group/unladen-swallow/browse_thread/thread/f2011129c4414d04 

-- 
http://mail.python.org/mailman/listinfo/python-list