Mike Meyer wrote:
> Jack Diederich <[EMAIL PROTECTED]> writes:


>> From reading this thread every couple of months on c.l.py for the last few years, it is my opinion that the number of people who think threading is the only solution to their problem greatly outnumbers the number of people who actually have such a problem (like, nearly all of them).


> Hear, hear. I find that threading typically introduces worse problems than it purports to solve.

> In my experience, threads should mainly be used if you need asynchronous access to a synchronous operation. You spawn the thread to make the call, it blocks on the relevant API, then notifies the main thread when it's done.


Since any sane code will release the GIL before making the blocking call, this scales to multiple CPUs just fine.
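Something along these lines, say (a rough, untested sketch in modern spelling; the URL and the queue-based hand-off are just illustrative choices):

import queue
import threading
import urllib.request

results = queue.Queue()

def worker(url):
    # The blocking network call releases the GIL while it waits, so the
    # main thread keeps running in the meantime.
    data = urllib.request.urlopen(url).read()
    results.put((url, len(data)))

t = threading.Thread(target=worker, args=("https://www.python.org/",))
t.start()
# ... main thread gets on with other work here ...
url, size = results.get()   # blocks until the worker has finished
t.join()
print(url, size)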

Another justification for threads is when you have a multi-CPU machine, and a processor-intensive operation you'd like to farm off to a separate CPU. In that case, you can treat the long-running operation like any other synchronous call, and farm off a thread that releases the GIL before starting the time-consuming operation.
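For example (a sketch only, assuming the extension module does the heavy lifting in C and drops the GIL while it works, as CPython's zlib does during compression):

import os
import threading
import zlib

payloads = [os.urandom(5_000_000) for _ in range(4)]
results = [None] * len(payloads)

def compress(i, data):
    # zlib does the compression in C and releases the GIL while it runs,
    # so several of these threads can occupy several CPUs at once.
    results[i] = zlib.compress(data, 9)

threads = [threading.Thread(target=compress, args=(i, p))
           for i, p in enumerate(payloads)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print([len(c) for c in results])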

The only time the GIL "gets in the way" is if the long-running operation you want to farm off is itself implemented in Python.

However, consider this: threads run on a CPU, so if you want to run multiple threads concurrently, you either need multiple CPUs or a time-slicing scheduler that fakes it.

Here's the trick: PYTHON THREADS DO NOT RUN DIRECTLY ON THE CPU. Instead, they run on a Python Virtual Machine (or the JVM/CLR Runtime/whatever), which then runs on the CPU. So, if you want to run multiple Python threads concurrently, you need multiple PVMs or a time-slicing scheduler. The GIL represents the latter.
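You can see the time-slicing directly (a rough benchmark sketch; absolute numbers will vary by machine): two threads doing pure-Python, CPU-bound work take about as long as one thread doing both jobs back to back, because only one of them holds the GIL at any instant.

import threading
import time

def count(n=10_000_000):
    # Pure-Python loop: it holds the GIL the whole time it runs.
    while n:
        n -= 1

# One thread doing the work twice, back to back.
start = time.perf_counter()
count(); count()
print("sequential:", time.perf_counter() - start)

# Two threads "in parallel" -- the GIL time-slices between them, so on
# CPython this takes roughly as long (often a touch longer).
start = time.perf_counter()
threads = [threading.Thread(target=count) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("threaded:  ", time.perf_counter() - start)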

Now, Python *could* try to provide the ability to have multiple virtual machines in a single process in order to more effectively exploit multiple CPUs. I have no idea if Java or the CLR work that way - my guess is that they do (or something that looks the same from a programmer's POV). But then, they have Sun/Microsoft directly financing the development teams.

A much simpler suggestion is that if you want a new PVM, just create a new OS process to run another copy of the Python interpreter. The effectiveness of your multi-CPU utilisation will then be governed by your OS's ability to correctly schedule multiple processes rather than by the PVM's ability to fake multiple processes using threads (Hint: the former is likely to be much better than the latter).
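A sketch of that idiom using the multiprocessing module (a later addition to the standard library); the worker function and pool size here are arbitrary. Each worker is a separate OS process with its own interpreter and its own GIL, so the OS scheduler spreads them across CPUs:

import multiprocessing

def crunch(n):
    # Pure-Python, CPU-bound work -- exactly the kind the GIL serialises
    # inside a single interpreter.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == "__main__":
    # Four separate interpreter processes, each with its own GIL; the OS
    # scheduler spreads them across whatever CPUs are available.
    with multiprocessing.Pool(processes=4) as pool:
        print(pool.map(crunch, [5_000_000] * 4))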

Additionally, schemes for inter-process communication are often far more scalable than those for inter-thread communication, since the former generally can't rely on shared memory (although good versions may utilise it for optimisation purposes). This means they can usually be applied to clustered computing rather effectively.
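To illustrate the point (just a sketch; the JSON-lines protocol and the worker's job are made up for the example): the parent below talks to a worker process over pipes, and because nothing in the exchange assumes shared memory, the same message format would work unchanged over a socket to another machine.

import json
import subprocess
import sys

# The worker reads JSON requests on stdin and writes JSON replies on
# stdout; swap the pipes for a socket and it could run on another box.
WORKER = r"""
import json, sys
for line in sys.stdin:
    req = json.loads(line)
    result = sum(x * x for x in range(req["n"]))
    print(json.dumps({"n": req["n"], "result": result}), flush=True)
"""

proc = subprocess.Popen([sys.executable, "-c", WORKER],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)
proc.stdin.write(json.dumps({"n": 1_000_000}) + "\n")
proc.stdin.flush()
print(json.loads(proc.stdout.readline()))
proc.stdin.close()
proc.wait()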

I would *far* prefer to see effort expended on making the idiom mentioned in the last couple of paragraphs simple and easy to use, rather than on a misguided effort to "Kill the GIL".

Cheers,
Nick.

P.S. If the GIL *really* bothers you, check out Stackless Python. As I understand it, it does its best to avoid the C stack (and hence threads) altogether.

--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---------------------------------------------------------------
            http://boredomandlaziness.skystorm.net