On 07/11/2014 08:04 AM, Ihar Hrachyshka wrote:
On 09/07/14 13:17, Ihar Hrachyshka wrote:
Hi all,
Multiple projects are suffering from db lock timeouts due to
deadlocks deep in the mysqldb library that we use to interact with
mysql servers. In essence, the problem is the missing eventlet
support in the mysqldb module: when a db lock is encountered, the
library does not yield to the next green thread (which would allow
other threads to eventually release the held lock); instead it
blocks the main thread until a timeout exception (OperationalError)
is raised.
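To make the blocking behavior concrete, here is a toy model of cooperative scheduling in plain Python (no eventlet; generators stand in for green threads, and all names are made up): a worker that yields lets others interleave, while one that does all its work before yielding, like a blocking C-level driver call, starves the rest.

```python
# Toy model of green-thread scheduling: generators stand in for green
# threads; yield is the point where an eventlet-aware driver hands
# control back to the hub.

def yielding_worker(name, log):
    for step in range(2):
        log.append(f"{name}:{step}")
        yield  # cooperative: lets other green threads run

def blocking_worker(name, log):
    for step in range(2):
        log.append(f"{name}:{step}")
    yield  # like mysqldb: only returns control once the C call finishes

def round_robin(threads):
    # Minimal round-robin scheduler over generator-based "green threads".
    queue = list(threads)
    while queue:
        thread = queue.pop(0)
        try:
            next(thread)
            queue.append(thread)
        except StopIteration:
            pass

fair = []
round_robin([yielding_worker("a", fair), yielding_worker("b", fair)])
# fair == ["a:0", "b:0", "a:1", "b:1"]: work interleaves

starved = []
round_robin([blocking_worker("a", starved), yielding_worker("b", starved)])
# starved == ["a:0", "a:1", "b:0", "b:1"]: "b" waits for all of "a"
```

The second run is the mysqldb situation in miniature: every other green thread, including the one that could release the lock, is stuck behind the blocking call until it finishes or times out.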
The failed operation is not retried, leaving the failing request
unserved. In Nova, there is a special retry mechanism for deadlocks,
though I think it's more a hack than a proper fix.
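For reference, that style of retry wrapper can be sketched roughly like this (illustrative only: the exception class, decorator name, and backoff here are made up for the sketch, not Nova's actual code, which matches its own deadlock exception types):

```python
import functools
import time

class DBDeadlockError(Exception):
    """Stand-in for the deadlock/timeout error the driver raises."""

def retry_on_deadlock(max_attempts=3, delay=0.01):
    """Retry a DB operation that fails with a deadlock error.

    Hypothetical sketch of the retry-on-deadlock idea; the real
    mechanism in Nova wraps its DB API methods.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except DBDeadlockError:
                    if attempt == max_attempts:
                        raise  # give up; caller sees the deadlock
                    time.sleep(delay * attempt)  # crude linear backoff
        return wrapper
    return decorator

calls = {"n": 0}

@retry_on_deadlock()
def flaky_update():
    # Fails twice with a "deadlock", then succeeds on the third try.
    calls["n"] += 1
    if calls["n"] < 3:
        raise DBDeadlockError("deadlock found when trying to get lock")
    return "committed"
```

Here `flaky_update()` succeeds on the third attempt; the point is that the operation is retried instead of the whole request failing, which masks rather than fixes the underlying blocking driver.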
Neutron is one of the projects that suffer from those timeout
errors a lot. Partly it's due to a lack of discipline in how we do
nested calls in the l3_db and ml2_plugin code, but that's not
something to change in the foreseeable future, so we need to find
another solution that is applicable for Juno. Ideally, the solution
should also be applicable for Icehouse, to allow distributors to
resolve existing deadlocks without waiting for Juno.
We've had several discussions and attempts to introduce a solution
to the problem. Thanks to the oslo.db folks, we now have a more or
less clear view of the cause of the failures and how to fix them
easily. The solution is to switch from mysqldb to something
eventlet-aware. The best candidate is probably the MySQL Connector
module, which is the official MySQL client for Python and shows
good (preliminary) results in terms of performance.
I've done additional testing, creating 2000 networks in parallel (10
thread workers) for both drivers and comparing the results.
With mysqldb: 215.81 sec
With mysql-connector: 88.66 sec
~2.4 times performance boost, ok? ;)
That really doesn't tell me much.
Please remember that performance != scalability.
If you showed the test/benchmark code, that would be great. You need to
run your benchmarks at varying levels of concurrency and varying levels
of read/write ratios for the workers. Otherwise it's like looking at a
single dot of paint on a painting. Without looking at the patterns of
throughput (performance) and concurrency/locking (scalability) across
various worker counts and read/write ratios, you miss the whole picture.
Another thing to ensure is that you are using real *processes*, not
threads, so that you actually simulate a real OpenStack service like
Nova or Neutron, which are multi-processed, not multi-threaded, and
have a greenlet pool within each worker process.
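A harness along those lines might look like the following sketch (all names are hypothetical; a tiny sleep stands in for the real create/read DB operation, and `ThreadPoolExecutor` can be swapped for `ProcessPoolExecutor` to get real worker processes, as suggested above):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

def dummy_op(kind):
    # Stand-in for a real DB call: a write (e.g. creating a network)
    # is assumed a bit slower than a read.
    time.sleep(0.002 if kind == "write" else 0.001)

def run_benchmark(op, n_ops, n_workers, write_ratio,
                  executor_cls=ThreadPoolExecutor):
    """Return throughput (ops/sec) for one concurrency level and
    read/write mix. Swap executor_cls for ProcessPoolExecutor to use
    real processes, as a multi-process OpenStack service would."""
    rng = random.Random(42)  # fixed seed: reproducible op mix
    kinds = ["write" if rng.random() < write_ratio else "read"
             for _ in range(n_ops)]
    start = time.monotonic()
    with executor_cls(max_workers=n_workers) as ex:
        list(ex.map(op, kinds))
    return n_ops / (time.monotonic() - start)

# Sweep concurrency and read/write ratio instead of a single data point.
results = {
    (workers, ratio): run_benchmark(dummy_op, 50, workers, ratio)
    for workers in (1, 5, 10)
    for ratio in (0.1, 0.9)
}
```

Plotting `results` over worker counts and write ratios gives the throughput/scalability pattern being asked for, rather than one number per driver.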
Best
-jay
I think we should switch to that library *even* if we forget about all
the nasty deadlocks we experience now.
I've posted a Neutron spec for the switch to the new client in Juno
at [1]. Ideally, the switch is just a matter of several fixes to
oslo.db that would enable full support for the new driver (already
supported by SQLAlchemy), plus the 'connection' string being
modified in service configuration files, plus documentation updates
to refer to the new official way to configure services for MySQL.
The database code won't, ideally, require any major changes, though
some adaptation for the new client library may be needed. That said,
Neutron does not seem to require any changes, though it was
revealed that there are some alembic migration rules in Keystone or
Glance that need (trivial) modifications.
You can see how trivially the switch can be achieved for a service
in the example for Neutron [2].
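For illustration, the configuration change amounts to a one-line edit of the 'connection' URL, since SQLAlchemy picks the client library from the `dialect+driver` prefix (host, database name, and credentials below are placeholders):

```ini
[database]
# Before: SQLAlchemy's default MySQL driver, i.e. mysqldb (MySQL-Python)
connection = mysql://neutron:NEUTRON_DBPASS@controller/neutron

# After: the eventlet-friendly MySQL Connector/Python driver
connection = mysql+mysqlconnector://neutron:NEUTRON_DBPASS@controller/neutron
```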
While this is a Neutron-specific proposal, there is an obvious wish
to switch to the new library globally throughout all the projects,
to reduce the devops burden, among other things. My vision is that,
ideally, we switch all projects to the new library in Juno, though
we may still leave several projects for K in case any issues arise,
similar to the way projects switched to oslo.messaging over two
cycles instead of one. Though looking at how easily Neutron can be
switched to the new library, I wouldn't expect any issues that
would postpone the switch till K.
It was mentioned in comments on the spec proposal that there were
some discussions at the latest summit around a possible switch in
the context of Nova that revealed some concerns, though these do
not seem to be documented anywhere. So if you know anything about
it, please comment.
So, we'd like to hear from other projects: what's your take on that
move? Do you see any issues or have concerns about it?
Thanks for your comments, /Ihar
[1]: https://review.openstack.org/#/c/104905/
[2]: https://review.openstack.org/#/c/105209/
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev