On Feb 4, 10:46 am, John Nagle <na...@animats.com> wrote:
> There's enough intercommunication between the threads working on
> a single site that it's a pain to do them as subprocesses. And I
> definitely don't want to launch subprocesses for each page; the
> Python load time would be worse than the actual work. The
> subprocess module assumes you're willing to launch a subprocess
> for each transaction.
You could perhaps use a process pool inside each domain worker to handle that worker's pages. There is multiprocessing.Pool, among other implementations. For example, in this library you can s/ThreadPool/ProcessPool/g and this example would still work: <http://www.onideas.ws/stream.py/#retrieving-web-pages-concurrently>. If you want to DIY with multiprocessing.Lock/Pipe/Queue, I don't see why it would be much more of a pain to write your workers as processes than as threads.

// aht
http://blog.onideas.ws
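To make the suggestion concrete, here is a minimal sketch of a per-domain worker using multiprocessing.Pool. The fetch function is a stand-in (a real crawler would call something like urllib.request.urlopen there); the point is only that pool.map hands each URL to a pooled worker process, so you pay the interpreter startup cost once per pool, not once per page.

```python
from multiprocessing import Pool

def fetch(url):
    # Stand-in for real page retrieval; a real worker would open the
    # URL here. Returning a string keeps the sketch self-contained.
    return "fetched:" + url

if __name__ == "__main__":
    urls = [
        "http://example.com/a",
        "http://example.com/b",
        "http://example.com/c",
    ]
    # A small pool of worker processes for this one domain; each page
    # is dispatched to an already-running process, so there is no
    # per-page Python startup cost.
    with Pool(processes=3) as pool:
        results = pool.map(fetch, urls)
    print(results)
```

Swapping Pool for multiprocessing.pool.ThreadPool gives the threaded variant with the identical API, which is what makes the s/ThreadPool/ProcessPool/ substitution mentioned above work.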