On 5/11/15 9:17 PM, Robert Collins wrote:
On 12 May 2015 at 10:44, Mike Bayer <mba...@redhat.com> wrote:

What we have today in our standard architecture for OpenStack is
optimised for IO bound workloads: waiting on the
network/subprocesses/disk/libvirt etc. Running high numbers of
eventlet handlers in a single process only works when the majority of
the work being done by a handler is IO.

Everything stated here is great, however in our situation there is one
unfortunate fact which renders it completely incorrect at the moment.   I'm
still puzzled why we are getting into deep think sessions about the vagaries
of the GIL and async when there is essentially a full-on red-alert
performance blocker rendering all of this discussion useless, so I must
again remind us: what we have *today* in Openstack is *as completely
un-optimized as you can possibly be*.
Sorry if it seems like I went on a tangent, but choosing a concurrency
model in Python, which a lot of this discussion has been about, is
inextricably linked to the workload being tackled. The point of my
tl;dr was that using threads - which gets us out of the pit below - is
fine for most of our workloads and irrelevant to the actual issues in
the other ones. Clearly that didn't come across. - Sorry.
Robert -

Other people noted my fast takeoff as well, so I think I saw "GIL" and lots of thoughtful calculations, and after that my reading comprehension was dulled by the fog of my own angst :). I'll try to slow down more next time.


The most GIL-heavy nightmare CPU bound task you can imagine running on 25
threads on a ten year old Pentium will run better than the Openstack we have
today, because we are running a C-based, non-eventlet patched DB library
within a single OS thread that happens to use eventlet, but the use of
eventlet is totally pointless because right now it blocks completely on all
database IO.
To confirm my understanding: this library releases the GIL, but
because we only have one thread, we don't get more work done.

Yes, that sucks. And your tl;dr is that we need to either use an
eventlet ready library or not use eventlet's greenthreads, either of
which I support as a short term rectification.
Yes, the GIL is released within the MySQLdb C routines, which are primarily focused on IO here.
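
To illustrate the exchange above, here is a minimal sketch using only the standard library. It is not OpenStack or eventlet code. It assumes `time.sleep` is a fair stand-in for a blocking C-level database call that releases the GIL, and the names `fake_db_query`, `run_serial` and `run_threaded` are hypothetical. With a single OS thread the calls run back to back even though the GIL is free; with several OS threads the waits overlap.

```python
import threading
import time


def fake_db_query(delay=0.05):
    # Hypothetical stand-in for a blocking DB call made from C code.
    # time.sleep releases the GIL while it waits, much as a C driver
    # can release it while waiting on the network.
    time.sleep(delay)


def run_serial(n, delay=0.05):
    # One OS thread: each call blocks the whole thread, so the waits
    # add up even though the GIL is released during each one.
    start = time.perf_counter()
    for _ in range(n):
        fake_db_query(delay)
    return time.perf_counter() - start


def run_threaded(n, delay=0.05):
    # Several OS threads: the GIL is released during each wait, so the
    # waits can overlap instead of adding up.
    threads = [
        threading.Thread(target=fake_db_query, args=(delay,))
        for _ in range(n)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("serial:   %.3fs" % run_serial(5))
    print("threaded: %.3fs" % run_threaded(5))
```

On a typical machine the serial run takes roughly five times the per-call delay, while the threaded run takes roughly one delay plus thread start-up overhead.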



Robert's analysis talks about various "at the limit" issues,  but I was
They tend to turn up at scale. You get 100 requests a day out of 5
million that are inexplicably slow, and eventually you have enough
data around the situation to try an experiment, and lo and behold the
problem goes away. They don't disagree with the argument you're making
though - this is just the bigger context, when folk go to deploy our
(real threads || eventlet friendly DB library) code, how many
processes will they need?
It's been pointed out separately that Openstack already uses a lot of processes, and even now, with our DB access serialized per process, we still achieve concurrency through them. So by all means, let's keep using processes; that is always a good thing, although it does present the challenge that we have a lot of DB connections open as a result (because we use pooling).



FWIW, I think moving to an eventlet friendly library should be the
first step because it can be done much more rapidly and with arguably
less risk.
Yes, I'm not really sure why we aren't just changing "mysql+mysqldb://" to "mysql+pymysql://" in our config files right now, because this would also solve the Py3K issue for the time being.
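
As a sketch of how small that change is, the helper below rewrites the driver portion of a SQLAlchemy-style database URL. The function name `switch_mysql_driver` and the example credentials are hypothetical, and it uses plain string handling rather than SQLAlchemy's own URL API; a real deployment would simply edit the connection string in its config file.

```python
def switch_mysql_driver(url, new_driver="pymysql"):
    """Return url with its MySQL driver replaced by new_driver.

    Hypothetical helper: handles 'dialect+driver://...' style URLs and
    leaves non-MySQL URLs untouched.
    """
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("not a database URL: %r" % url)
    # The scheme is 'dialect' or 'dialect+driver'; keep only the dialect.
    dialect = scheme.split("+", 1)[0]
    if dialect != "mysql":
        return url
    return "%s+%s://%s" % (dialect, new_driver, rest)


if __name__ == "__main__":
    # Example URL with made-up credentials and host.
    print(switch_mysql_driver("mysql+mysqldb://nova:secret@dbhost/nova"))
```

Because PyMySQL is written in pure Python, its socket IO can be made cooperative by eventlet's monkey patching, which is what makes this one-line change attractive as a first step.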



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
