Dan Stromberg <drsali...@gmail.com>:

> On Fri, Nov 14, 2014 at 10:42 AM, Empty Account <empty...@gmail.com> wrote:
>> I am thinking about writing a load test tool in Python, so I am
>> interested in how I can create the most concurrent threads/processes
>> with the fewest OS resources. I would imagine that I/O would need to
>> be non-blocking.
>
> If you need a large amount of concurrency, you might look at Jython
> with threads. Jython threads well.
>
> If you don't intend to do more than a few hundred concurrent things,
> you might just go with CPython and multiprocessing.
It is very rare to need a "large amount of concurrency." Most hardware
doesn't even support it: the number of CPU cores (plus hyperthreads)
poses a physical limit that cannot be exceeded, and the I/O throughput
will almost certainly be more limiting than the CPU.

What I'm getting at is that it is generally not a good idea to
represent a large number of simultaneous operations with an equal
number of threads or processes. While the simplicity of that idea is
enticing, it often leads to an expensive refactoring years down the
road (been there, done that). Instead, address the true concurrency
needs with a group/pool of processes or threads, and represent your
simultaneous contexts with objects that you map onto the processes or
threads (a sketch follows at the end of this message). If your
application does not involve obnoxious, blocking library calls (e.g.,
database access), you might achieve top throughput with a single
process (no threads).

Java had to reinvent its whole stdlib I/O paradigm to address the
scalability problems of the naive zillion-thread approach (NIO).
Python is undergoing a similar transformation (asyncio), although it
has always provided low-level facilities for "doing the right thing."

To summarize, this is how I implement these kinds of applications:

 1. If all I/O is nonblocking (and Linux's blocking file I/O doesn't
    get in the way), I implement the application single-threaded. In
    Python, I use select.epoll(EPOLLET) with callbacks (see the epoll
    sketch below). Python's new asyncio framework is a portable, funky
    way to implement the same idea.

 2. If I must deal with blocking I/O calls, I set up a pool of
    processes. The size of the pool is calculated from several
    factors: the number of CPU cores, network latencies and the server
    throughput (see the pool-sizing sketch below).

 3. I generally much prefer processes over threads because they
    provide better fault tolerance.
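For instance, here is a minimal sketch of mapping many logical
operations onto a small, fixed pool of workers, using
concurrent.futures from the stdlib. The fetch() helper and the URL
list are made-up placeholders, and twenty workers is an arbitrary
choice:

    import concurrent.futures
    import urllib.request

    # A thousand logical downloads, but only twenty threads.
    URLS = ["http://example.com/%d" % i for i in range(1000)]

    def fetch(url):
        # One "simultaneous context": a single blocking download.
        with urllib.request.urlopen(url, timeout=10) as resp:
            return url, len(resp.read())

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as pool:
        for url, size in pool.map(fetch, URLS):
            print(url, size)

Substituting ProcessPoolExecutor gives the same structure with better
fault isolation, at the cost of pickling the arguments and results.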
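Next, a minimal sketch of point 1: a single-threaded, callback-driven
loop over select.epoll() in edge-triggered mode (Linux only). Error
paths, partial writes and backpressure are elided; a real server would
queue outgoing data instead of calling sendall() on a nonblocking
socket:

    import select
    import socket

    listener = socket.socket()
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", 8000))
    listener.listen(100)
    listener.setblocking(False)

    epoll = select.epoll()
    callbacks = {}                      # fd -> callback

    def on_accept():
        # Edge-triggered: accept until the kernel has nothing left.
        while True:
            try:
                conn, _ = listener.accept()
            except BlockingIOError:
                return
            conn.setblocking(False)
            callbacks[conn.fileno()] = make_echo_callback(conn)
            epoll.register(conn.fileno(),
                           select.EPOLLIN | select.EPOLLET)

    def make_echo_callback(conn):
        def on_readable():
            # Edge-triggered: drain the socket on every event.
            while True:
                try:
                    data = conn.recv(4096)
                except BlockingIOError:
                    return
                if not data:
                    epoll.unregister(conn.fileno())
                    del callbacks[conn.fileno()]
                    conn.close()
                    return
                conn.sendall(data)      # naive; see the caveat above
        return on_readable

    callbacks[listener.fileno()] = on_accept
    epoll.register(listener.fileno(), select.EPOLLIN | select.EPOLLET)

    while True:
        for fd, _ in epoll.poll():
            callbacks[fd]()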
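And a sketch of point 2: a pool of processes in front of a blocking
call. The sizing numbers are illustrative assumptions, not a formula
that fits every workload; the idea is to keep the cores busy while
most workers sit waiting on the server:

    import multiprocessing
    import time

    def query(item):
        # Placeholder for an obnoxious, blocking library call
        # (e.g., a database client without an async interface).
        time.sleep(0.05)                # pretend network round trip
        return item * 2

    if __name__ == "__main__":
        cores = multiprocessing.cpu_count()
        # Assumed ~50 ms of server latency per ~5 ms of local CPU
        # work, so each core can overlap roughly ten in-flight calls.
        pool_size = cores * 10
        with multiprocessing.Pool(pool_size) as pool:
            results = pool.map(query, range(100))
        print(sum(results))


Marko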