Hello,

While I don't pretend to be an authority on the subject, a few days of research has led me to believe that a discussion needs to be started (or continued) on the state and direction of multi-threaded Python.
Python is not multi-threading friendly. Any code that deals with the Python interpreter must hold the global interpreter lock (GIL). This has the effect of serializing (to a certain extent) all Python-specific operations. That is, any thread written purely in Python will release the GIL only at particular (and possibly non-optimal) times -- currently the rather arbitrary quantum of 100 bytecode instructions. Since the OS can only schedule a Python thread when that thread is actually able to run (according to the lock), Python threads do not benefit from a good scheduler in the same manner that real OS threads do, even though Python threads are supposed to be a thin wrapper around real OS threads [1].

The detrimental effects of the GIL have been discussed several times, and nobody has ever done anything about it. This is because the GIL isn't really that bad right now. It isn't held that much, and pthreads spawned by Python/C interactions (i.e., those that reside in extensions) can do all their processing concurrently as long as they aren't dealing with Python data. What this means is that Python multithreading isn't really broken, as long as Python is thought of as a convenient way of manipulating C. After all, 100 bytecode instructions go by pretty quickly, so the GIL isn't really THAT invasive.

Python, however, is much more than a convenient way of manipulating C. Python provides a simple language which can be implemented in any way, so long as the promised behaviors are preserved. We should take advantage of that. The truth is that the future (and present reality) of almost every form of computing is multi-core, and there is currently no effective way of dealing with concurrency. We still worry about setting up threads, synchronizing message queues, synchronizing shared memory regions, dealing with asynchronous behaviors, and, most importantly, how threaded an application should be.
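To make the pure-Python case concrete, here's a minimal sketch (the prime-counting workload and names are mine, just for illustration): two CPU-bound threads both run to completion and compute correct answers, but because every bytecode step needs the GIL, they only ever interleave rather than execute in parallel.

```python
import threading

def count_primes(limit):
    # CPU-bound, pure-Python work: many bytecode instructions, so the
    # interpreter periodically lets another thread have the GIL -- but
    # only one thread executes Python bytecode at any instant.
    found = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            found += 1
    return found

results = {}

def worker(name, limit):
    results[name] = count_primes(limit)

threads = [threading.Thread(target=worker, args=(i, 5000)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # both threads report 669 primes; correct, but serialized
```

Both workers finish, and writing to distinct dict keys is safe here, but on a multi-core box this runs no faster than doing the two calls back to back.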
All of this is possible to do manually in C, but it's hardly optimal. For instance, at compile time you have no idea whether your library is going to be running on a machine with 1 processor or 100. Knowing that makes a huge difference in architecture: 200 threads might run fine on the 100-core machine but thrash the single processor to death. Thread pools help, but they need to be set up and initialized, and there are very few good thread pool implementations meant for generic use. It is my feeling that there is no better way of dealing with dynamic threading than to use a dynamic language.

Stackless Python has proven that clever manipulation of the stack can dramatically improve concurrent performance within a single thread. Stackless revolves around tasklets, which are a nearly universal concept. For those who don't follow experimental Python implementations: Stackless essentially provides an integrated scheduler for "green threads" (tasklets), extremely lightweight snippets of code that can be run concurrently, and it even provides a nice way of messaging between tasklets.

When you think about it, a lot of object-oriented code can be organized as tasklets. After all, encapsulation provides an environment where the side effects of running functions can be minimized, and such code is thus somewhat easily parallelized (with respect to other objects). Functional programming is, of course, ideal, but it's hardly the trendy thing these days. Maybe that will change when people realize how much easier it is to test and parallelize.

What these seemingly unrelated thoughts come down to is a perfect opportunity for Python to become THE next-generation language. It is already far more advanced than almost every other language out there.
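On the pool-sizing point above: a dynamic language can simply ask the machine how many cores it has at runtime, where a C library has to guess at build time. Here is a deliberately minimal, generic pool sketch (entirely my own toy code, not a proposal for an API) sized from the live core count:

```python
import os
import threading
from queue import Queue

class Pool:
    # A toy generic thread pool. The size is chosen at *runtime* from
    # the actual core count -- the decision C code must freeze at
    # compile time.
    def __init__(self, size=None):
        self.tasks = Queue()
        size = size or os.cpu_count() or 1
        for _ in range(size):
            threading.Thread(target=self._worker, daemon=True).start()

    def _worker(self):
        while True:
            func, args, out = self.tasks.get()
            out.append(func(*args))   # deliver the result
            self.tasks.task_done()

    def submit(self, func, *args):
        out = []                      # result lands here when done
        self.tasks.put((func, args, out))
        return out

    def join(self):
        self.tasks.join()             # wait for all queued work

pool = Pool()
out = pool.submit(sum, [1, 2, 3])
pool.join()
print(out)  # [6]
```

This is only a sketch (no error handling, no shutdown protocol), which is rather the point: even the simple version takes real setup, and every C project reinvents it.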
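The tasklet idea above can be sketched without Stackless at all, using plain generators. To be clear, this is not the Stackless API (which provides stackless.tasklet and channels and is far richer); it's just a toy round-robin scheduler showing how cheap cooperative "green threads" are:

```python
from collections import deque

class Scheduler:
    # Toy round-robin scheduler: each "tasklet" is a generator that
    # yields at its cooperative switch points.
    def __init__(self):
        self.ready = deque()

    def spawn(self, gen):
        self.ready.append(gen)

    def run(self):
        while self.ready:
            task = self.ready.popleft()
            try:
                next(task)              # run until the tasklet yields
            except StopIteration:
                continue                # tasklet finished
            self.ready.append(task)     # otherwise reschedule it

log = []

def tasklet(name, steps):
    for i in range(steps):
        log.append((name, i))
        yield                           # cooperative switch point

sched = Scheduler()
sched.spawn(tasklet("a", 2))
sched.spawn(tasklet("b", 2))
sched.run()
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

The two tasklets interleave perfectly with no OS threads, no locks, and almost no per-task overhead -- which is exactly what makes spawning thousands of them plausible.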
By integrating Stackless into an architecture where tasklets can be divided over several parallelizable threads, Python will be able to capitalize on performance gains that will have people using it *for* its performance, rather than that being the excuse not to use it. The nice thing is that this requires a fairly doable amount of work. First, Stackless should be integrated into the core. Then there should be an effort to remove Python threading's reliance on the GIL. After that, advanced features like moving tasklets among threads should be explored.

I can imagine a world where a single Python web application is able to redistribute its millions of requests among thousands of threads without the developer ever having been aware that the application would eventually need to scale. An efficient and natively multi-threaded implementation of Python will be invaluable as cores continue to multiply like rabbits.

There has been much discussion of this in the past [2]. Those discussions, I feel, were premature. Now that Stackless is mature (and continuation-free!), Py3k is in full swing, and parallel programming has been fully recognized as THE next big problem for computer science, the time is ripe for discussing how we will approach multi-threading in the future.

Justin

[1] I haven't actually looked at the GIL code. It's possible that it creates a set of wait queues for each nice level a Python thread runs at and just wakes the higher-priority threads first, thus preserving the nice values determined by the scheduler, or something. I highly doubt it. I bet every Python thread gets an equal chance at that lock, regardless of whatever patterns the scheduler may have noticed.

[2] http://groups.google.com/group/comp.lang.python/browse_thread/thread/7d16083cf34a706f/858e64a2a6a5d976?q=stackless%2C+thread+paradigm&lnk=ol&
http://www.stackless.com/pipermail/stackless/2003-June/000742.html
... More that I lost; just search this group.
-- http://mail.python.org/mailman/listinfo/python-list