On Dec 12, 2014, at 1:16 PM, Mike Bayer <mba...@redhat.com> wrote: > >> On Dec 12, 2014, at 9:27 AM, Ihar Hrachyshka <ihrac...@redhat.com> wrote: >> >> -----BEGIN PGP SIGNED MESSAGE----- >> Hash: SHA512 >> >> Reading the latest comments at >> https://github.com/PyMySQL/PyMySQL/issues/275, it seems to me that the >> issue is not to be solved in drivers themselves but instead in >> libraries that arrange connections (sqlalchemy/oslo.db), correct? >> >> Will the proposed connection reopening help? > > disagree, this is absolutely a driver bug. I’ve re-read that last comment > and now I see that the developer is suggesting that this condition not be > flagged in any way, so I’ve responded. The connection should absolutely blow > up and if it wants to refuse to be usable afterwards, that’s fine (it’s the > same as MySQLdb “commands out of sync”). It just has to *not* emit any > further SQL as though nothing is wrong. > > It doesn’t matter much for PyMySQL anyway, I don’t know that PyMySQL is up to > par for openstack in any case (look at the entries in their changelog: > https://github.com/PyMySQL/PyMySQL/blob/master/CHANGELOG "Several other bug > fixes”, “Many bug fixes"- really? is this an iphone app?)
This does make me a little concerned about merging https://review.openstack.org/#/c/133962/ so I’ve added a -2 for the time being to let the discussion go on here. Doug > > We really should be looking to get this fixed in MySQL-connector, which seems > to have a similar issue. It’s just so difficult to get responses from > MySQL-connector that the PyMySQL thread is at least informative. > > > > > >> >> /Ihar >> >> On 05/12/14 23:43, Mike Bayer wrote: >>> Hey list - >>> >>> I’m posting this here just to get some ideas on what might be >>> happening here, as it may or may not have some impact on Openstack >>> if and when we move to MySQL drivers that are async-patchable, like >>> MySQL-connector or PyMySQL. I had a user post this issue a few >>> days ago which I’ve since distilled into test cases for PyMySQL and >>> MySQL-connector separately. It uses gevent, not eventlet, so I’m >>> not really sure if this applies. But there’s plenty of very smart >>> people here so if anyone can shed some light on what is actually >>> happening here, that would help. >>> >>> The program essentially illustrates code that performs several >>> steps upon a connection, however if the greenlet is suddenly >>> killed, the state from the connection, while damaged, is still >>> being allowed to continue on in some way, and what’s >>> super-catastrophic here is that you see a transaction actually >>> being committed *without* all the statements proceeding on it. >>> >>> In my work with MySQL drivers, I’ve noted for years that they are >>> all very, very bad at dealing with concurrency-related issues. The >>> whole “MySQL has gone away” and “commands out of sync” errors are >>> ones that we’ve all just drowned in, and so often these are due to >>> the driver getting mixed up due to concurrent use of a connection. >>> However this one seems more insidious. Though at the same time, >>> the script has some complexity happening (like a simplistic >>> connection pool) and I’m not really sure where the core of the >>> issue lies. >>> >>> The script is at >>> https://gist.github.com/zzzeek/d196fa91c40cb515365e and also below. >>> If you run it for a few seconds, go over to your MySQL command line >>> and run this query: >>> >>> SELECT * FROM table_b WHERE a_id not in (SELECT id FROM table_a) >>> ORDER BY a_id DESC; >>> >>> and what you’ll see is tons of rows in table_b where the “a_id” is >>> zero (because cursor.lastrowid fails), but the *rows are >>> committed*. If you read the segment of code that does this, it >>> should be impossible: >>> >>> connection = pool.get() rowid = execute_sql( connection, "INSERT >>> INTO table_a (data) VALUES (%s)", ("a",) ) >>> >>> gevent.sleep(random.random() * 0.2) try: execute_sql( connection, >>> "INSERT INTO table_b (a_id, data) VALUES (%s, %s)", (rowid, "b",) >>> ) connection.commit() pool.return_conn(connection) >>> >>> except Exception: connection.rollback() >>> pool.return_conn(connection) >>> >>> so if the gevent.sleep() throws a timeout error, somehow we are >>> getting thrown back in there, with the connection in an invalid >>> state, but not invalid enough to commit. >>> >>> If a simple check for “SELECT connection_id()” is added, this query >>> fails and the whole issue is prevented. Additionally, if you put a >>> foreign key constraint on that b_table.a_id, then the issue is >>> prevented, and you see that the constraint violation is happening >>> all over the place within the commit() call. The connection is >>> being used such that its state just started after the >>> gevent.sleep() call. >>> >>> Now, there’s also a very rudimental connection pool here. That is >>> also part of what’s going on. If i try to run without the pool, >>> the whole script just runs out of connections, fast, which suggests >>> that this gevent timeout cleans itself up very, very badly. >>> However, SQLAlchemy’s pool works a lot like this one, so if folks >>> here can tell me if the connection pool is doing something bad, >>> then that’s key, because I need to make a comparable change in >>> SQLAlchemy’s pool. Otherwise I worry our eventlet use could have >>> big problems under high load. >>> >>> >>> >>> >>> >>> # -*- coding: utf-8 -*- import gevent.monkey >>> gevent.monkey.patch_all() >>> >>> import collections import threading import time import random >>> import sys >>> >>> import logging logging.basicConfig() log = >>> logging.getLogger('foo') log.setLevel(logging.DEBUG) >>> >>> #import pymysql as dbapi from mysql import connector as dbapi >>> >>> >>> class SimplePool(object): def __init__(self): self.checkedin = >>> collections.deque([ self._connect() for i in range(50) ]) >>> self.checkout_lock = threading.Lock() self.checkin_lock = >>> threading.Lock() >>> >>> def _connect(self): return dbapi.connect( user="scott", >>> passwd="tiger", host="localhost", db="test") >>> >>> def get(self): with self.checkout_lock: while not self.checkedin: >>> time.sleep(.1) return self.checkedin.pop() >>> >>> def return_conn(self, conn): try: conn.rollback() except: >>> log.error("Exception during rollback", exc_info=True) try: >>> conn.close() except: log.error("Exception during close", >>> exc_info=True) >>> >>> # recycle to a new connection conn = self._connect() with >>> self.checkin_lock: self.checkedin.append(conn) >>> >>> >>> def verify_connection_id(conn): cursor = conn.cursor() try: >>> cursor.execute("select connection_id()") row = cursor.fetchone() >>> return row[0] except: return None finally: cursor.close() >>> >>> >>> def execute_sql(conn, sql, params=()): cursor = conn.cursor() >>> cursor.execute(sql, params) lastrowid = cursor.lastrowid >>> cursor.close() return lastrowid >>> >>> >>> pool = SimplePool() >>> >>> # SELECT * FROM table_b WHERE a_id not in # (SELECT id FROM >>> table_a) ORDER BY a_id DESC; >>> >>> PREPARE_SQL = [ "DROP TABLE IF EXISTS table_b", "DROP TABLE IF >>> EXISTS table_a", """CREATE TABLE table_a ( id INT NOT NULL >>> AUTO_INCREMENT, data VARCHAR (256) NOT NULL, PRIMARY KEY (id) ) >>> engine='InnoDB'""", """CREATE TABLE table_b ( id INT NOT NULL >>> AUTO_INCREMENT, a_id INT NOT NULL, data VARCHAR (256) NOT NULL, -- >>> uncomment this to illustrate where the driver is attempting -- to >>> INSERT the row during ROLLBACK -- FOREIGN KEY (a_id) REFERENCES >>> table_a(id), PRIMARY KEY (id) ) engine='InnoDB' """] >>> >>> connection = pool.get() for sql in PREPARE_SQL: >>> execute_sql(connection, sql) connection.commit() >>> pool.return_conn(connection) print("Table prepared...") >>> >>> >>> def transaction_kill_worker(): while True: try: connection = None >>> with gevent.Timeout(0.1): connection = pool.get() rowid = >>> execute_sql( connection, "INSERT INTO table_a (data) VALUES (%s)", >>> ("a",)) gevent.sleep(random.random() * 0.2) >>> >>> try: execute_sql( connection, "INSERT INTO table_b (a_id, data) >>> VALUES (%s, %s)", (rowid, "b",)) >>> >>> # this version prevents the commit from # proceeding on a bad >>> connection # if verify_connection_id(connection): # >>> connection.commit() >>> >>> # this version does not. It will commit the # row for table_b >>> without the table_a being present. connection.commit() >>> >>> pool.return_conn(connection) except Exception: >>> connection.rollback() pool.return_conn(connection) >>> sys.stdout.write("$") except gevent.Timeout: # try to return the >>> connection anyway if connection is not None: >>> pool.return_conn(connection) sys.stdout.write("#") except >>> Exception: # logger.exception(e) sys.stdout.write("@") else: >>> sys.stdout.write(".") finally: if connection is not None: >>> pool.return_conn(connection) >>> >>> >>> def main(): for i in range(50): >>> gevent.spawn(transaction_kill_worker) >>> >>> gevent.sleep(3) >>> >>> while True: gevent.sleep(5) >>> >>> >>> if __name__ == "__main__": main() >>> >>> >>> >>> >>> >>> >>> _______________________________________________ OpenStack-dev >>> mailing list OpenStack-dev@lists.openstack.org >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev >>> >> -----BEGIN PGP SIGNATURE----- >> Version: GnuPG/MacGPG2 v2.0.22 (Darwin) >> >> iQEcBAEBCgAGBQJUivtJAAoJEC5aWaUY1u57nn8IAJP5zK/htG8EeoOSWZVV1ksA >> en+lQuIA09aNdkHSNS1b/lNOYwsF4X5SM0dU5Cs4LCsumC5jM9S/cNOn3sfpVooA >> vd31O1kKtd255YtnsKSmrOPiytoI69n2/65tVqgWLHpuXRSaj4HtqOEY/vOWMX6g >> BON50QUYwwxAZLNOPmEO7vUnJ3VYO6zquH2mQrA1Vg/LCm3+VaodEHOVCxieaJ/n >> iQPB4Vx1dkuP10HzWyjQW0j4kbUakqgkq/VHaiCYNC85HzPz6KJUOK/neZcBrWsZ >> RQcLae1dX1yGMXDd5hyJaoe3qUfjuvSZmV5jS3ok/x8rnKdmVl65PtUUlLfVOU0= >> =wJ55 >> -----END PGP SIGNATURE----- >> >> _______________________________________________ >> OpenStack-dev mailing list >> OpenStack-dev@lists.openstack.org >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev > > > _______________________________________________ > OpenStack-dev mailing list > OpenStack-dev@lists.openstack.org > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev _______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev