On 17 May, 22:27, Paul Boddie <p...@boddie.org.uk> wrote:
> On 17 May, 14:05, jer...@martinfamily.freeserve.co.uk wrote:
>
> > From a user point of view I think that adding a 'par' construct to
> > Python for parallel loops would add a lot of power and simplicity,
> > e.g.
>
> >     par i in list:
> >         updatePartition(i)
>
> You can do this right now with a small amount of work to make
> updatePartition a callable which works in parallel, and without the
> need for extra syntax. For example, with the pprocess module, you'd
> use boilerplate like this:
>
>     import pprocess
>     queue = pprocess.Queue(limit=ncores)
>     updatePartition = queue.manage(pprocess.MakeParallel(updatePartition))
>
> (See http://www.boddie.org.uk/python/pprocess/tutorial.html#Map for
> details.)
>
> At this point, you could use a normal "for" loop, and you could then
> "sync" for results by reading from the queue. I'm sure it's a similar
> story with the multiprocessing/processing module.
>
> > There would be no locking and it would be the programmer's
> > responsibility to ensure that the loop was truly parallel and
> > correct.
>
> Yes, that's the idea.
>
> > The intention of this would be to speed up Python execution on
> > multi-core platforms. Within a few years we will see 100+ core
> > processors as standard and we need to be ready for that.
>
> In what sense are we not ready? Perhaps the abstractions could be
> better, but it's definitely possible to run Python code on multiple
> cores today and get decent core utilisation.
>
> > There could also be parallel versions of map, filter and reduce
> > provided.
>
> Yes, that's what pprocess.pmap is for, and I imagine that other
> solutions offer similar facilities.
>
> > BUT...none of this would be possible with the current
> > implementation of Python with its Global Interpreter Lock, which
> > effectively rules out true parallel processing.
>
> > See: http://jessenoller.com/2009/02/01/python-threads-and-the-global-inter...
>
> > What do others think?
>
> That your last statement is false: true parallel processing is
> possible today. See the Wiki for a list of solutions:
>
>     http://wiki.python.org/moin/ParallelProcessing
>
> In addition, Jython and IronPython don't have a global interpreter
> lock, so you have the option of using threads with those
> implementations, too.
>
> Paul

Hi Paul and others,

Thanks for your responses to my original questions.

Paul, thanks for explaining about the pprocess module, which appears
very useful. I presume that it uses multiple operating system
processes rather than threads, which would probably make it suitable
for coarse-grained rather than fine-grained parallel programming,
because of the overhead of starting up new processes and of sharing
objects. (How is that done, by the way?) It probably has advantages
and disadvantages compared with thread-based parallelism.

My suggestion is primarily about using multiple threads and sharing
memory - something akin to the OpenMP directives that one of you
mentioned. To do this efficiently would involve removing the Global
Interpreter Lock, or switching to Jython or IronPython as you
mentioned.

However, I *do* actually want to add syntax to the language. I think
that 'par' makes sense as an official Python construct - Occam has
had it for twenty-five years. The reason is ease of use. I would like
to make it easy for amateur programmers to exploit the natural
parallelism in their algorithms. For instance, somebody who wishes to
calculate a property of each member of a list of chemical structures
using the Python Daylight interface could, with my suggestion,
potentially get a massive speed-up just by changing 'for' to 'par' or
'map' to 'pmap' (or by calling map with a parallel keyword argument
set, as suggested). At present they would have to manually chop up
their work and run it as multiple processes (sketched below) to
achieve the same - fine for expert programmers, but not reasonable
for people working in other domains who wish to use Python as a
utility because of its fantastic productivity and ease of use.

Let me clarify what I think par, pmap, pfilter and preduce would mean
and how they would be implemented.

A par loop is like a for loop, except that the programmer is
asserting that the order in which the iterations are performed
doesn't matter, so they may be performed in parallel. The Python
system then has the option to allocate a number of threads to the
task and share out the iterations accordingly between them. (It might
be that the programmer should be allowed to define the number of
threads explicitly, or can delegate that decision to the system.)

Parallel pmap and pfilter would be implemented in much the same way,
although the resultant list might have to be reassembled from the
partial results returned by each thread.

As people have pointed out, a parallel reduce is a trickier
proposition, because it requires the binary operation to be
associative; in that case it can be parallelised by calculating the
result using a tree-based evaluation strategy.
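To make the contrast with today's manual approach concrete, here is
roughly what the Daylight example above has to look like at present,
using the standard multiprocessing module that Paul mentions.
(calculateProperty, the toy SMILES strings and the pool size are
illustrative assumptions on my part, not the actual Daylight API.)

    from multiprocessing import Pool

    def calculateProperty(structure):
        # Hypothetical stand-in for a per-structure Daylight calculation.
        return len(structure)

    if __name__ == '__main__':
        structures = ['CCO', 'c1ccccc1', 'CC(=O)O']  # toy SMILES strings
        pool = Pool(processes=4)  # one worker process per core
        # The parallel equivalent of map(calculateProperty, structures):
        results = pool.map(calculateProperty, structures)
        pool.close()
        pool.join()
        print(results)

It works, but the Pool plumbing and the picklability constraints on
the worker function are exactly the kind of detail I would like
amateur programmers not to have to think about.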
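To show what I mean by sharing out the iterations, here is a minimal
sketch of how a thread-based pmap might be implemented; the chunking
scheme, the nthreads default and the helper names are illustrative
assumptions rather than a concrete proposal:

    import threading

    def pmap(func, seq, nthreads=4):
        # Split the input into contiguous chunks, map each chunk on
        # its own thread, then return the results reassembled in order.
        seq = list(seq)
        results = [None] * len(seq)

        def worker(start, stop):
            for i in range(start, stop):
                results[i] = func(seq[i])

        chunk = max(1, (len(seq) + nthreads - 1) // nthreads)
        threads = [threading.Thread(target=worker,
                                    args=(start, min(start + chunk, len(seq))))
                   for start in range(0, len(seq), chunk)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # the implicit barrier at the end of a par loop
        return results

Of course, under CPython's GIL the threads in this sketch cannot run
CPU-bound work in parallel - which is precisely why the proposal
depends on removing the lock.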
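And here is the tree-based evaluation strategy for preduce, written
serially for clarity; the two recursive halves are what a real
implementation could hand to separate threads, and correctness rests
entirely on op being associative. (This sketch assumes a non-empty
sequence.)

    def preduce(op, seq):
        # Tree-shaped reduction: computes (a op b) op (c op d) rather
        # than (((a op b) op c) op d), so op must be associative.
        if len(seq) == 1:
            return seq[0]
        mid = len(seq) // 2
        # A real implementation would evaluate the two halves on
        # separate threads; here they run one after the other.
        return op(preduce(op, seq[:mid]), preduce(op, seq[mid:]))

For example, preduce(operator.add, [1, 2, 3, 4]) groups the work as
(1 + 2) + (3 + 4), which matches the serial result only because
addition is associative.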
I have used all of OpenMP, MPI and Occam in the past. OpenMP adds
parallelism to programs through special comment strings, MPI through
explicit calls to library routines, and Occam through explicit
syntactic structures. Each has its advantages. I like the simplicity
of OpenMP, the cross-language portability of MPI, and the fact that
concurrency is built into the Occam language. What I am proposing
here is a hybrid of the OpenMP and Occam approaches - a change to the
language that is very natural and yet easy for programmers to
understand. Concurrency is generally regarded as the hardest concept
for programmers to grasp.

Jeremy
--
http://mail.python.org/mailman/listinfo/python-list