Re: multiprocessing problems
Hi Doxa,

DoxaLogos wrote:
[...]
> I found out my problems. One thing I did was followed the test queue
> example in the documentation, but the biggest problem turned out to be
> a pool instantiated globally in my script was causing most of the
> endless process spawn, even with the "if __name__ == "__main__":" block.

Problems that solve themselves are the best problems ;)

One tip: your current algorithm carries some overhead. You start four additional Python interpreters, let them process their files, shut all of the newly spawned interpreters down again, and then start four interpreters for the next set of files. For this kind of job I prefer to start the processes once and feed them with data via a queue. That avoids the repeated spawning and improves runtime performance.

This could look like the following (untested; because of the pseudo functions it is not directly executable):

import multiprocessing
import Queue

class Worker(multiprocessing.Process):
    def __init__(self, feeder_q, queue_filled):
        multiprocessing.Process.__init__(self)
        self.feeder_q = feeder_q
        self.queue_filled = queue_filled

    def run(self):
        serve = True
        # serve until all work is done
        while serve:
            try:
                # fetch work from the queue; this blocks the process for up to
                # 5 seconds. If no item has arrived by then, Queue.Empty is raised.
                text = self.feeder_q.get(True, timeout=5)
                if text:
                    do_stuff(text)
                # very important! Tell the queue that the fetched work has been
                # finished, otherwise feeder_q.join() would block forever.
                self.feeder_q.task_done()
            except Queue.Empty:
                # as soon as the queue is empty and all work has been enqueued,
                # the process can terminate itself
                if self.queue_filled.is_set() and self.feeder_q.empty():
                    serve = False
        return

if __name__ == '__main__':
    number_of_processes = 4
    queue_filled = multiprocessing.Event()
    feeder_q = multiprocessing.JoinableQueue()
    process_list = []

    # get the file names which need to be processed
    all_files = get_all_files()

    # start the worker processes
    for i in xrange(number_of_processes):
        process = Worker(feeder_q, queue_filled)
        process.start()
        process_list.append(process)

    # start feeding
    for file in all_files:
        feeder_q.put(file)

    # inform the workers that all work has been ordered
    queue_filled.set()

    # wait until the queue is empty
    feeder_q.join()

    # wait until all processes have finished their jobs
    for process in process_list:
        process.join()

Cheers,
Nils
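For completeness: since Python 2.6 the stdlib wraps this whole pattern in multiprocessing.Pool, which likewise starts its workers exactly once and feeds them for you. A minimal sketch, untested and reusing the pseudo functions do_stuff() and get_all_files() from above; note the Pool is created inside the __main__ block, not at module level, which was exactly the bug Doxa ran into:

import multiprocessing

def process_file(filename):
    do_stuff(filename)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=4)  # spawn the 4 workers once
    pool.map(process_file, get_all_files())   # blocks until every file is done
    pool.close()
    pool.join()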
Re: multiprocessing problems
Hi,

Adam Tauno Williams wrote:
[...]
> Here's the guts of my latest incarnation.
>
> def ProcessBatch(files):
>     p = []
>     for file in files:
>         p.append(Process(target=ProcessFile, args=(file,)))
>     for x in p:
>         x.start()
>     for x in p:
>         x.join()
>     p = []
>     return
>
> Now, the function calling ProcessBatch looks like this:
>
> def ReplaceIt(files):
>     processFiles = []
>     for replacefile in files:
>         if(CheckSkipFile(replacefile)):
>             processFiles.append(replacefile)
>             if(len(processFiles) == 4):
>                 ProcessBatch(processFiles)
>                 processFiles = []
>     # check for leftover files once the main loop is done and process them
>     if(len(processFiles) > 0):
>         ProcessBatch(processFiles)
>
> According to this you will create files in sets of four, but an unknown
> number of sets of four.

This is not correct, because of the x.join(). join() blocks the parent process until the joined process has terminated, so ProcessBatch does not return before all four workers are done. Only as soon as the current set of processes has finished its job will the next set be spawned. In other words, the sets run strictly one after another, never alongside each other.

Cheers,
Nils
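A side effect of this batching is that a single slow file stalls its whole set of four: three cores sit idle until the fourth worker finishes. If that matters, a Pool that hands each worker a new file the moment it becomes free avoids the barrier. A hedged, untested sketch reusing ProcessFile and CheckSkipFile from the post above:

from multiprocessing import Pool

def ReplaceIt(files):
    work = [f for f in files if CheckSkipFile(f)]
    pool = Pool(processes=4)
    # imap_unordered() streams results back as workers finish, so no
    # worker ever waits for the rest of its "batch"
    for _ in pool.imap_unordered(ProcessFile, work):
        pass
    pool.close()
    pool.join()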
Re: source install of python2.7 and rpm install of cx_Oracle collision
Hi Jim,

Jim Qiu wrote:
[...]
> I find out that only libpython2.7.a generated when I install python2.7,
> who can tell me what I need to do ? I want a libpython2.7.so.1.0 generated

I hadn't read your complete mail before sending my other reply... In addition to the steps I described there, you need to tell the configure script that you would like to have shared libraries. So add --enable-shared to your configure call:

./configure --prefix=/opt/Python2.7a --enable-shared

After building and installing you will find the shared libraries in the lib folder of your prefix.

Cheers,
Nils
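If you want to double-check, the freshly built interpreter can tell you itself whether it was configured with --enable-shared. A quick diagnostic sketch:

from distutils import sysconfig

# 1 if the build was configured with --enable-shared, 0 otherwise
print sysconfig.get_config_var('Py_ENABLE_SHARED')
# directory where libpython2.7.so* was installed
print sysconfig.get_config_var('LIBDIR')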
Re: source install of python2.7 and rpm install of cx_Oracle collision
Hi Jim,

Jim Qiu wrote:
> I make installed python 2.7 from source, and also installed the RPM
> version of cx_Oracle for python 2.7. But ldd tells me:
>
> # ldd cx_Oracle.so
>     libpython2.7.so.1.0 => not found
>
> I find out that only libpython2.7.a generated when I install python2.7,
> who can tell me what I need to do ? I want a libpython2.7.so.1.0 generated
[...]

Because you compiled Python from source, the Python library is not in the default library search path, which is why it can't be found. As a quick workaround you can extend the LD_LIBRARY_PATH variable with the path and check whether cx_Oracle now picks up the lib. Let's assume you "installed" Python at /opt/Python2.7a; then you need to extend the variable like this:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/opt/Python2.7a/lib"

Afterwards run ldd again and check whether the lib is found now. If not, you've got the wrong path.

Next you need to persist the changed library path. For this you need to be root:

1. echo "/opt/Python2.7a/lib" > /etc/ld.so.conf.d/Python2.7a.conf
2. refresh your shared library cache with "ldconfig"

Now your OS knows the new library location. But this is not clean. As Daniel mentioned, you should try to get an RPM. Otherwise you may get into trouble if you install a newer Python 2.7 version and forget to maintain your library paths.

Cheers,
Nils
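As a further sanity check you can ask the dynamic loader directly from Python whether it resolves the library now. Start a fresh interpreter after exporting LD_LIBRARY_PATH (or after running ldconfig), since the search path is picked up at process start; this small sketch either succeeds or raises OSError:

import ctypes

# dlopen() the shared lib by its soname; raises OSError if it cannot be found
ctypes.CDLL("libpython2.7.so.1.0")
print "libpython2.7.so.1.0 resolved"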
Re: Seeding the rand() Generator
Hi Fred,

I just saw your SQL statement

> An example would be:
> SELECT first, second, third, fourth, fifth, sixth from sometable
> order by rand() limit 1

and I feel compelled to give you some advice: don't use this SQL statement to pick a random row. Your users, and maybe your DBA, will thank you.

You are certainly asking why. Let's take a brief look at what you are asking your MySQL DB to do:

  fetch all rows from 'sometable' (restricted to the attributes 'first', 'second', ...),
  sort every one of them by a freshly generated random value,
  and afterwards throw everything away except the first row.

If your table has 10 rows, you fetch and sort all 10, pick one, and discard the other 9. The waste grows with the table: at a million rows you sort a million rows just to keep one. That doesn't sound very clever, right?

So please take a look at this site for a better alternative to that approach:

http://jan.kneschke.de/projects/mysql/order-by-rand/

If you want to know more, please check this article too:

http://jan.kneschke.de/2007/2/22/analyzing-complex-queries

Regards,
Nils
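The core trick from the first link is to generate one random id and fetch the matching row over an index instead of sorting the whole table. A hedged sketch of how that might look from Python via MySQLdb; it assumes 'sometable' has an indexed AUTO_INCREMENT column named 'id' with few gaps, and the connection parameters are of course placeholders:

import MySQLdb

conn = MySQLdb.connect(host="localhost", user="user", passwd="secret", db="mydb")
cur = conn.cursor()
# pick one random id and jump to it via the index; ">= ... ORDER BY ... LIMIT 1"
# keeps this working even when some ids have been deleted
cur.execute("""
    SELECT t.first, t.second, t.third, t.fourth, t.fifth, t.sixth
    FROM sometable AS t
    JOIN (SELECT CEIL(RAND() * (SELECT MAX(id) FROM sometable)) AS rid) AS r
      ON t.id >= r.rid
    ORDER BY t.id
    LIMIT 1
""")
print cur.fetchone()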
Re: Standardizing RPython - it's time.
Hi,

On 10/12/2010 07:41 AM, John Nagle wrote:
[...]
> With Unladen Swallow looking like a failed IT project, a year behind
> schedule and not delivering anything like the promised performance,
> Google management may pull the plug on funding. Since there hasn't
> been a "quarterly release" in a year, that may already have effectively
> happened.

Are you sure that the project has failed? Yes, the project page is outdated, but there is also PEP 3146 [1], "Merging Unladen Swallow into CPython". According to the accepted PEP, the merge is planned for Python 3.3. So to me it looks like the development itself has finished and the merging has started...

Does anyone have more details?

Regards,
Nils

[1] http://www.python.org/dev/peps/pep-3146/
Re: Standardizing RPython - it's time.
On 10/12/2010 05:18 PM, Nils Ruettershoff wrote:
> Hi,
>
> On 10/12/2010 07:41 AM, John Nagle wrote:
> [...]
>> With Unladen Swallow looking like a failed IT project, a year behind
>> schedule and not delivering anything like the promised performance,
>> Google management may pull the plug on funding. Since there hasn't
>> been a "quarterly release" in a year, that may already have effectively
>> happened.
>
> Are you sure that the project has failed? Yes, the project page is
> outdated, but there is also PEP 3146 [1], "Merging Unladen Swallow into
> CPython". According to the accepted PEP, the merge is planned for Python
> 3.3. So to me it looks like the development itself has finished and the
> merging has started...

I've just found a statement from Collin Winter [1]. He says that he is now working on other Google projects, but that they are still targeting the merge of Unladen Swallow into CPython 3.3. The last commit to the branch (py3-jit) was in June this year. So I would say the project is not dead and has not failed; it is delayed.

Cheers,
Nils

[1] http://groups.google.com/group/unladen-swallow/browse_thread/thread/f2011129c4414d04