There don't seem to be any good examples for POSH, or at least none that are clear to me, particularly for the kind of for loop that mk, who started this thread, is using. How would something like that be done with POSH? The examples show how to share variables between processes/threads, but nothing about how the workers get started or how to drive them from a for loop (see the sketch below).
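To make the question concrete, here is the spawning pattern I mean, written with the standard multiprocessing module rather than POSH, whose API I won't guess at. This is a minimal sketch under stated assumptions: the worker function, the sleep(1) body, and the Semaphore-based cap of 100 live processes are illustrative, not taken from mk's script.

import multiprocessing
import time

def worker(sem, i):
    # Stand-in for real work; release our slot when done so the
    # main loop can start the next process.
    try:
        time.sleep(1)
    finally:
        sem.release()

if __name__ == '__main__':
    sem = multiprocessing.Semaphore(100)   # at most 100 live processes
    procs = []
    for i in xrange(10000):
        sem.acquire()                      # blocks while 100 are running
        p = multiprocessing.Process(target=worker, args=(sem, i))
        p.start()
        procs.append(p)
    for p in procs:
        p.join()
    print 'All processes finished'

The semaphore is passed through args so the children inherit it and can release their slot when they finish; that is the part I can't find a POSH equivalent for in its examples.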
-Alex Goretoy
http://www.alexgoretoy.com
somebodywhoca...@gmail.com

On Sat, Jan 3, 2009 at 1:31 PM, Nick Craig-Wood <n...@craig-wood.com> wrote:
> mk <mrk...@gmail.com> wrote:
> > After reading http://www.python.org/dev/peps/pep-0371/ I was under the
> > impression that the performance of the multiprocessing package is
> > similar to that of thread / threading. However, to familiarize myself
> > with both packages I wrote my own test of spawning and returning
> > 100,000 empty threads or processes (while maintaining at most 100
> > processes / threads active at any one time), respectively.
> >
> > The results I got are very different from the benchmark quoted in PEP
> > 371. On a twin Xeon machine the threaded version executed in 5.54 secs,
> > while the multiprocessing version took over 222 secs to complete!
> >
> > Am I doing something wrong in the code below?
>
> Yes!
>
> The problem with your code is that you never start more than one
> process at once in the multiprocessing example. Just check ps while it
> is running and you will see.
>
> My conjecture is that this is due to the way fork() works under unix.
> I think that when the parent forks, it yields the CPU to the child.
> Because you are giving the child effectively no work to do, it returns
> immediately, re-awakening the parent, thus serialising your jobs.
>
> If you give the children some work to do you'll see a quite different
> result. I gave each child time.sleep(1) to do and cut the total number
> down to 10,000.
>
> $ ./test_multiprocessing.py
> == Process 1000 working ==
> == Process 2000 working ==
> == Process 3000 working ==
> == Process 4000 working ==
> == Process 5000 working ==
> == Process 6000 working ==
> == Process 7000 working ==
> == Process 8000 working ==
> == Process 9000 working ==
> == Process 10000 working ==
> === Main thread waiting for all processes to finish ===
> Total time: 101.382129192
>
> $ ./test_threading.py
> == Thread 1000 working ==
> == Thread 2000 working ==
> == Thread 3000 working ==
> == Thread 4000 working ==
> == Thread 5000 working ==
> == Thread 6000 working ==
> == Thread 7000 working ==
> == Thread 8000 working ==
> == Thread 9000 working ==
> == Thread 10000 working ==
> Total time: 100.659118176
>
> So almost identical results, and as expected: we ran 10,000 sleep(1)s
> in 100 seconds, so we must have been running 100 simultaneously.
>
> If you replace the "time.sleep(1)" with "for _ in xrange(1000000):
> pass" you get this much more interesting answer on my dual core linux
> laptop, showing nicely the effect of contention on the python global
> interpreter lock and how multiprocessing avoids it.
>
> $ ./test_multiprocessing.py
> == Process 1000 working ==
> == Process 2000 working ==
> == Process 3000 working ==
> == Process 4000 working ==
> == Process 5000 working ==
> == Process 6000 working ==
> == Process 7000 working ==
> == Process 8000 working ==
> == Process 9000 working ==
> == Process 10000 working ==
> === Main thread waiting for all processes to finish ===
> Total time: 266.808327913
>
> $ ./test_threading.py
> == Thread 1000 working ==
> == Thread 2000 working ==
> == Thread 3000 working ==
> == Thread 4000 working ==
> == Thread 5000 working ==
> == Thread 6000 working ==
> == Thread 7000 working ==
> == Thread 8000 working ==
> == Thread 9000 working ==
> == Thread 10000 working ==
> Total time: 834.81882
>
> --
> Nick Craig-Wood <n...@craig-wood.com> -- http://www.craig-wood.com/nick
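For anyone who wants to reproduce Nick's second measurement, the harness is easy to reconstruct. Below is a sketch of the threaded side with the CPU-bound body he substituted; the names and the Semaphore cap are my reconstruction, not mk's original script, which wasn't quoted in full here.

import threading
import time

def worker(sem, i):
    # Burn CPU instead of sleeping, so contention on the GIL shows up.
    try:
        for _ in xrange(1000000):
            pass
        if i % 1000 == 0:
            print '== Thread %d working ==' % i
    finally:
        sem.release()    # free a slot for the next thread

def main():
    sem = threading.Semaphore(100)   # at most 100 live threads
    threads = []
    start = time.time()
    for i in xrange(1, 10001):
        sem.acquire()                # blocks while 100 are running
        t = threading.Thread(target=worker, args=(sem, i))
        t.start()
        threads.append(t)
    for t in threads:
        t.join()
    print 'Total time:', time.time() - start

if __name__ == '__main__':
    main()

The multiprocessing version has the same shape as the sketch earlier in this message: swap threading.Thread for multiprocessing.Process and threading.Semaphore for multiprocessing.Semaphore. On a multi-core machine the CPU-bound runs should then flip in multiprocessing's favour, which is exactly the ~267 s vs ~835 s split Nick reports.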