On 5/11/15 5:25 PM, Robert Collins wrote:
Details: Skip over this bit if you know it all already. The GIL plays a big factor here: if you want to scale the amount of CPU available to a Python service, you have two routes: A) move work to a different process through some RPC - be that DBs using SQL, other services using oslo.messaging or HTTP - whatever. B) use C extensions to perform work in threads - e.g. openssl context processing.

To increase concurrency you can use threads, eventlet, asyncio, twisted etc. - but because within a single process *all* Python bytecode execution happens inside the GIL lock, you get at most one CPU for a CPU-bound workload. For an IO-bound workload, you can fit more work in by context switching within that one CPU's capacity. And - the GIL is a poor scheduler, so at the limit - an IO-bound workload where the IO backend has more capacity than we have CPU to consume it within our process - you will run into priority inversion and other problems. [This varies by Python release too.]

request_duration = time_in_cpu + time_blocked
request_cpu_utilisation = time_in_cpu / request_duration
cpu_utilisation = concurrency * request_cpu_utilisation

Assuming that we don't want any one process to spend a lot of time at 100% - to avoid such at-the-limit issues - let's pick say 80% utilisation, or a safety factor of 0.2. If a single request spends 50% of its duration waiting on IO and 50% executing bytecode, we can only run one such request concurrently without hitting 100% utilisation (2 * 0.5 CPU == 1). For a request that spends 75% of its duration waiting on IO and 25% on CPU, we can run 3 such requests concurrently without exceeding our target of 80% utilisation (3 * 0.25 = 0.75).

What we have today in our standard architecture for OpenStack is optimised for IO-bound workloads: waiting on the network/subprocesses/disk/libvirt etc. Running high numbers of eventlet handlers in a single process only works when the majority of the work being done by a handler is IO.
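To make the arithmetic in those formulas concrete, here is a minimal sketch; the helper name and the 0.8 target figure are illustrative only, not anything from the thread:

    # Illustrative only: the helper name and the 0.8 default are assumptions;
    # the formulas are the ones quoted above.
    def max_concurrency(time_in_cpu, time_blocked, target_utilisation=0.8):
        """How many such requests fit in one process without exceeding the target."""
        request_duration = time_in_cpu + time_blocked
        request_cpu_utilisation = time_in_cpu / request_duration
        # cpu_utilisation = concurrency * request_cpu_utilisation <= target
        return int(target_utilisation // request_cpu_utilisation)

    print(max_concurrency(0.5, 0.5))    # 50% CPU, 50% IO -> 1 request
    print(max_concurrency(0.25, 0.75))  # 25% CPU, 75% IO -> 3 requests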
Everything stated here is great; however, in our situation there is one unfortunate fact that renders it completely incorrect at the moment. I'm still puzzled why we are getting into deep-think sessions about the vagaries of the GIL and async when there is essentially a full-on, red-alert performance blocker that renders all of this discussion useless, so I must again remind us: what we have *today* in OpenStack is *as completely un-optimized as you can possibly be*.
The most GIL-heavy nightmare of a CPU-bound task you can imagine, running on 25 threads on a ten-year-old Pentium, will run better than the OpenStack we have today, because we are running a C-based, non-eventlet-patched DB library within a single OS thread that happens to use eventlet, and that use of eventlet is totally pointless because it blocks completely on all database IO. All production OpenStack applications today are fully serialized: they can emit only a single query to the database at a time. For each message sent, the entire application blocks, an order of magnitude longer than it would under the GIL, waiting for the database library to send a message to MySQL, waiting for MySQL to send a response including the full results, waiting for the database library to unwrap the response into Python structures, and only then returning to Python space, where we can send another database message and again block the entire application and all greenlets while this single message proceeds.
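To make that concrete, here's a minimal, self-contained sketch (not OpenStack code; it assumes only that eventlet is installed) where an un-patched sleep stands in for the C driver and a monkey-patched sleep stands in for a pure-Python driver whose socket IO cooperates with the hub:

    import time

    import eventlet
    from eventlet import patcher

    eventlet.monkey_patch()   # patches time.sleep, sockets, etc.

    # The un-patched sleep blocks the whole OS thread, just like a C DB driver
    # that never yields to the eventlet hub.
    _real_sleep = patcher.original('time').sleep

    def blocking_query(i):
        # Stands in for MySQLdb: blocks at the C level, so every greenlet waits.
        _real_sleep(0.1)
        return i

    def cooperative_query(i):
        # Stands in for PyMySQL: the patched sleep yields to the hub, so
        # other greenlets run while this one waits.
        time.sleep(0.1)
        return i

    for fn in (blocking_query, cooperative_query):
        pool = eventlet.GreenPool(100)
        start = time.time()
        list(pool.imap(fn, range(100)))
        print("%s: %.2fs for 100 simulated queries"
              % (fn.__name__, time.time() - start))

The blocking version runs the 100 waits strictly one after another (roughly 10 seconds); the cooperative version overlaps them (roughly 0.1 seconds). That serialization, applied to real queries, is exactly the effect described above.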
To share a link I've already shared about a dozen times here, here are some tests under similar conditions that illustrate what that concurrency looks like: http://www.diamondtin.com/2014/sqlalchemy-gevent-mysql-python-drivers-comparison/. Under even modestly high concurrency, MySQLdb takes *20 times longer* to handle the work of 100 sessions than PyMySQL when it is inappropriately run under gevent. When I talk about moving to threads, this is not a "won't help or hurt" kind of issue; at the moment it's a change that would allow an immediate, massive improvement to the performance of all OpenStack applications. We need to change the DB library or dump eventlet.
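For concreteness, here is a minimal sketch of what "change the DB library" amounts to at the SQLAlchemy level (hypothetical credentials and hostnames; the relevant driver must be installed in each case). The driver is selected by the URL scheme, so the switch is a connection-string change rather than a rewrite:

    import sqlalchemy as sa

    # Today: MySQL-Python (C extension).  Under eventlet its calls block the
    # entire process for the duration of every query.
    engine_c = sa.create_engine("mysql+mysqldb://user:password@localhost/nova")

    # Alternative: PyMySQL (pure Python).  Its socket IO can be monkey-patched,
    # so greenlets interleave while a query is in flight.
    engine_py = sa.create_engine("mysql+pymysql://user:password@localhost/nova")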
As for whether we should dump eventlet or use a pure-Python DB library, my contention is that threads plus a C database library will outperform eventlet plus a Python-based database library. Additionally, whichever change we make, we may very well see all kinds of new database-concurrency-related bugs in our apps, because we will suddenly be talking to the database much more intensively; it is my opinion that a traditional threading model will be an easier environment in which to work out the approach to these issues. We have to assume "concurrency at any time" in any case, because we run multiple instances of Nova etc. at the same time. At the end of the day, we aren't going to see wildly better performance with one approach over the other, so we should pick the one that is easier to develop, maintain, and keep stable.
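As a sketch of what the thread-based option looks like (hypothetical URL, pool sizes, and query; it assumes SQLAlchemy, the MySQL-Python driver, and a reachable MySQL server): the C driver releases the GIL while it waits on the server, so a plain thread pool gets real database concurrency with no monkey patching at all:

    from concurrent.futures import ThreadPoolExecutor

    import sqlalchemy as sa

    engine = sa.create_engine(
        "mysql+mysqldb://user:password@localhost/test",
        pool_size=20, max_overflow=0,
    )

    def run_query(i):
        # Each worker thread checks out its own connection; the driver blocks
        # in C with the GIL released, so the queries overlap.
        with engine.connect() as conn:
            return conn.execute(sa.text("SELECT SLEEP(0.1)")).scalar()

    with ThreadPoolExecutor(max_workers=20) as executor:
        results = list(executor.map(run_query, range(100)))

With 20 threads and a 0.1-second query, the 100 calls finish in roughly half a second rather than ten.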
Robert's analysis talks about various "at the limit" issues, but I was unable to reproduce these in my own testing, and we should be relying on working tests to illustrate which performance characteristics actually pan out. My tests only dealt with psycopg2 and PostgreSQL, for example; won't someone work with my tests and try to replicate them with PyMySQL/eventlet vs. MySQL-Python/threads? We should be relying on testing to see what reality actually holds here. But more than that, we first need to fix the obviously broken thing about our DB access before we can claim anything is optimized at all, and after we do that, I don't think splitting hairs over threads vs. eventlet is really going to make that much of a difference performance-wise. We should go with whatever produces the most stable development and usage experience while still allowing a high degree of concurrency.