I know there's a performance penalty for running multithreaded Python on a multicore CPU, but how bad is it? I've read the key paper ("www.dabeaz.com/python/GIL.pdf"), of course. It would be tolerable if the GIL just limited Python to running on one CPU at a time, but it's worse than that; there's excessive overhead due to a lame locking implementation. According to Beazley, CPU-bound multithreaded code runs HALF AS FAST on a dual-core CPU as on a single-core CPU.
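For anyone who wants to measure this on their own machine, here is a minimal sketch of the kind of test involved: the same CPU-bound countdown run twice in sequence and then in two threads at once. The function names and loop count are my own choices for illustration, not taken from the paper, and the result depends heavily on the Python version and the number of cores.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop; it holds the GIL the whole time it runs."""
    while n > 0:
        n -= 1


def time_sequential(n, runs=2):
    """Run count_down `runs` times back to back in one thread."""
    start = time.perf_counter()
    for _ in range(runs):
        count_down(n)
    return time.perf_counter() - start


def time_threaded(n, num_threads=2):
    """Run count_down in `num_threads` threads at the same time."""
    threads = [
        threading.Thread(target=count_down, args=(n,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 5_000_000  # arbitrary; large enough to dominate thread startup cost
    print("sequential: %.2f s" % time_sequential(n))
    print("threaded:   %.2f s" % time_threaded(n))
```

If the threaded time comes out noticeably larger than the sequential time, the difference is lock contention rather than useful work.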
My main server application, which runs "sitetruth.com", has both multiple processes and multiple threads in each process. The system rates web sites, which involves reading and parsing up to 20 pages from each domain. Analysis of each domain is performed in a separate process, but each process uses multiple threads to read and process several web pages simultaneously.

Some of the threads go compute-bound for a second or two at a time as they parse web pages. Sometimes two threads in the same process (never more than three) are parsing web pages at the same time, so they're contending for CPU time. So this is nearly the worst case for the lame GIL lock logic.

Has anyone tried using "affinity" ("http://pypi.python.org/pypi/affinity") to lock each Python process to a single CPU? Does that help?

John Nagle
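On the affinity question, here is a minimal sketch of pinning a process to one CPU. It assumes Linux and Python 3.3 or later, and uses the standard library's os.sched_setaffinity instead of the "affinity" package; pin_to_single_cpu is a hypothetical helper name. Whether pinning actually reduces the GIL overhead for a workload like this one is the open question, so treat it as something to measure, not a fix.

```python
import os


def pin_to_single_cpu(cpu):
    """Restrict the calling process (and its threads) to one CPU.

    Returns the resulting affinity set, or None where the platform
    does not provide os.sched_setaffinity (it is not available on
    every OS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    os.sched_setaffinity(0, {cpu})  # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_getaffinity"):
        # Pick a CPU the process is already allowed to run on.
        target = min(os.sched_getaffinity(0))
        print("now restricted to:", pin_to_single_cpu(target))
    else:
        print("CPU affinity is not supported on this platform")
```

Each worker process would call this once at startup, with a different CPU number per process, before starting its threads.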