I'm filing 160 million data points into a set of bins based on their position. At the moment, this takes just over an hour using interval trees. I would like to parallelise this to take advantage of my quad core machine. I have some experience of Parallel Python, but PP seems to only really work for problems where you can do one discrete bit of processing and recombine these results at the end.
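To make the "process discrete pieces, then recombine" pattern concrete, here is a minimal sketch of what I mean. It uses the standard `multiprocessing` module rather than PP. The bin edges, the names `bin_chunk`, `merge_bins` and `bin_parallel`, and the use of `bisect` in place of my interval trees are all illustrative assumptions, not my real code.

```python
from bisect import bisect_right
from multiprocessing import Pool

# Hypothetical bin edges: bin i covers [EDGES[i], EDGES[i + 1]).
EDGES = [0, 10, 20, 30, 40]


def bin_chunk(chunk):
    """File one chunk of positions into a private (unshared) list of bins."""
    bins = [[] for _ in range(len(EDGES) - 1)]
    for pos in chunk:
        i = bisect_right(EDGES, pos) - 1
        if 0 <= i < len(bins):  # skip positions that fall outside every bin
            bins[i].append(pos)
    return bins


def merge_bins(partials):
    """Recombine the per-chunk bins into one list of bins."""
    merged = [[] for _ in range(len(EDGES) - 1)]
    for bins in partials:
        for i, contents in enumerate(bins):
            merged[i].extend(contents)
    return merged


def bin_parallel(points, workers=4):
    """Split the points into chunks, bin each in its own process, then merge."""
    size = max(1, -(-len(points) // workers))  # ceiling division
    chunks = [points[i:i + size] for i in range(0, len(points), size)]
    with Pool(workers) as pool:
        return merge_bins(pool.map(bin_chunk, chunks))


if __name__ == "__main__":
    points = [3, 12, 25, 37, 8, 19, 31, 22]
    print(bin_parallel(points))
```

Each worker fills its own bins, so no locking is needed. The cost is that every chunk and every partial result has to be pickled and passed between processes, which may matter at 160 million points.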
I guess I could thread my code and use mutexes to protect the shared lists that every thread is filing into. However, my understanding is that Python would still be running in a single process, so threading won't let me use multiple cores. Does anybody have any suggestions for this?

Peter
--
http://mail.python.org/mailman/listinfo/python-list