On 5/11/15 9:58 AM, Attila Fazekas wrote:



----- Original Message -----
From: "John Garbutt" <j...@johngarbutt.com>
To: "OpenStack Development Mailing List (not for usage questions)" 
<openstack-dev@lists.openstack.org>
Cc: "Dan Smith" <d...@danplanet.com>
Sent: Saturday, May 9, 2015 12:45:26 PM
Subject: Re: [openstack-dev] [all] Replace mysql-python with mysqlclient

On 30 April 2015 at 18:54, Mike Bayer <mba...@redhat.com> wrote:
On 4/30/15 11:16 AM, Dan Smith wrote:
There is an open discussion to replace mysql-python with PyMySQL, but
PyMySQL has worse performance:

https://wiki.openstack.org/wiki/PyMySQL_evaluation
My major concern with not moving to something different (i.e. not based
on the C library) is the threading problem. Especially as we move in the
direction of cellsv2 in nova, not blocking the process while waiting for
a reply from mysql is going to be critical. Further, I think that we're
likely to get back a lot of performance from a supports-eventlet
database connection because of the parallelism that conductor currently
can only provide in exchange for the footprint of forking into lots of
workers.

If we're going to move, shouldn't we be looking at something that
supports our threading model?
yes, but at the same time, we should change our threading model at the level
of where APIs are accessed to refer to a database, at the very least using a
threadpool behind eventlet. CRUD-oriented database access is faster using
traditional threads, even in Python, than using an eventlet-like system or
using explicit async. The tests at
http://techspot.zzzeek.org/2015/02/15/asynchronous-python-and-databases/
show this. With traditional threads, we can stay on the C-based MySQL
APIs and take full advantage of their speed.
Sorry to go back in time, I wanted to go back to an important point.

It seems we have three possible approaches:
* C lib and eventlet, blocks whole process
* pure python lib, and eventlet, eventlet does its thing
* go for a C lib and dispatch calls via thread pool
* go with a pure C protocol lib that explicitly uses `python patch-able`
   I/O functions (and maybe others, e.g. threading, mutex, sleep)

* go with a pure C protocol lib where the Python part explicitly calls
   `decode` and `encode`; the C part just does the CPU-intensive operations
   and never calls I/O primitives.

We have a few problems:
* performance sucks, we have to fork lots of nova-conductors and api nodes
* need to support python2.7 and 3.4, but it's not currently possible
with the lib we use?
* want to pick a lib that we can fix when there are issues, and work to improve

It sounds like:
* currently do the first one, it sucks, forking nova-conductor helps
* seems we are thinking the second one might work, we sure get py3.4 +
py2.7 support
* the last will mean more work, but it's likely to be more performant
* worried we are picking an unsupported lib with little future

I am leaning towards us moving to making DB calls with a thread pool
and some fast C based library, so we get the 'best' performance.

Is that a crazy thing to be thinking? What am I missing here?
Using the python socket from C code:
https://github.com/esnme/ultramysql/blob/master/python/io_cpython.c#L100
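
A rough sketch of what "patch-able" I/O means, written in Python rather than C. The driver takes the socket module as a parameter instead of hard-wiring it, so a green replacement can be swapped in. `TinyClient`, `FakeGreenSocket` and the address are hypothetical names for illustration, not part of ultramysql or eventlet; the fake returns a canned reply so the demo needs no server.

```python
import socket


class TinyClient:
    """Hypothetical driver skeleton: all I/O goes through an injectable module."""

    def __init__(self, socket_module=socket):
        self._socket = socket_module

    def roundtrip(self, address, payload):
        sock = self._socket.create_connection(address)
        try:
            sock.sendall(payload)
            return sock.recv(4096)
        finally:
            sock.close()


class FakeGreenSocket:
    """Hypothetical stand-in for a patched socket module.

    A real green module would yield to the event loop instead of blocking.
    """

    def __init__(self):
        self.addresses = []
        self._peers = []

    def create_connection(self, address):
        self.addresses.append(address)
        client_end, server_end = socket.socketpair()
        server_end.sendall(b"pong")  # canned reply, buffered for the client
        self._peers.append(server_end)  # keep the peer end open
        return client_end


fake = FakeGreenSocket()
reply = TinyClient(socket_module=fake).roundtrip(("db.example", 3306), b"ping")
```

With the default argument the same client uses the stdlib `socket` module, which is also what a monkey-patching library would replace at runtime.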

It is also possible to implement a MySQL driver as just a protocol parser,
leaving you free to use your favorite event-based I/O strategy (direct epoll
usage), even without eventlet (or similar).
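
To make the "just a protocol parser" idea concrete, here is a small sketch that only frames and unframes packets and never touches a socket. It assumes the MySQL wire framing of a 3-byte little-endian payload length followed by a 1-byte sequence id. `encode_packet` and `PacketDecoder` are hypothetical names, and splitting of oversized payloads is left out.

```python
import struct


def encode_packet(sequence_id, payload):
    """Frame a payload: 3-byte little-endian length, then 1-byte sequence id."""
    if len(payload) >= 0xFFFFFF:
        # Multi-packet payloads are out of scope for this sketch.
        raise ValueError("payload too large for a single packet")
    header = struct.pack("<I", len(payload))[:3] + bytes([sequence_id & 0xFF])
    return header + payload


class PacketDecoder:
    """Incremental decoder: feed it bytes, get back complete packets."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, data):
        self._buffer += data
        packets = []
        while len(self._buffer) >= 4:
            length = int.from_bytes(self._buffer[:3], "little")
            if len(self._buffer) < 4 + length:
                break  # incomplete packet, wait for more bytes
            sequence_id = self._buffer[3]
            payload = bytes(self._buffer[4:4 + length])
            del self._buffer[:4 + length]
            packets.append((sequence_id, payload))
        return packets


# The caller owns the I/O; here the "wire" is just a bytes object.
wire = encode_packet(0, b"SELECT 1") + encode_packet(1, b"ok")
decoder = PacketDecoder()
first = decoder.feed(wire[:5])   # header plus one payload byte only
rest = decoder.feed(wire[5:])    # remainder completes both packets
```

Because `feed` takes whatever bytes the caller has read, it can be driven equally from an epoll loop, a green thread or a plain blocking socket.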

The issue with ultramysql is that it does not implement
the `standard` Python DB API, so you would need to add an extra wrapper for
SQLAlchemy.

This driver appears to have seen its last commit about a year ago, and it doesn't even implement the standard DBAPI (which is already a red flag). There is apparently a separately released (!) DBAPI-compat wrapper https://pypi.python.org/pypi/umysqldb/1.0.3 which has had no releases in two years. If this wrapper is indeed compatible with MySQLdb then it would run in SQLAlchemy without changes (though I'd be extremely surprised if it passes our test suite).

How would using these obscure libraries be preferable to running Nova API functions within the thread-pooling facilities already included with eventlet? Keep in mind that I've now done the work [1] to show that there is no performance gain to be had for all the trouble we go through to use eventlet/gevent/asyncio with local database connections.

[1] http://techspot.zzzeek.org/2015/02/15/asynchronous-python-and-databases/






