Hi Eugene, please see comments inline. Bottom line, though: setting the transaction isolation level to READ COMMITTED should be avoided.

On 11/18/2014 01:38 PM, Eugene Nikanorov wrote:
Hi neutron folks,

There is an ongoing effort to refactor some Neutron DB logic to be
compatible with Galera/MySQL, which doesn't support locking
(with_lockmode('update')).

Some code paths that used locking in the past were rewritten to retry
the operation if they detect that an object was modified concurrently.
The problem here is that all DB operations (CRUD) are performed within
the scope of some transaction, so that complex operations execute in an
atomic manner.

Yes. The root of the problem in Neutron is that the session object is passed through all of the various plugin methods, and session.begin(subtransactions=True) is used all over the place, when in reality many things should not need to be done in long-lived transactional containers.

For MySQL the default transaction isolation level is REPEATABLE READ,
which means that once the code issues a query within a transaction, that
query will return the same result for the duration of the transaction
(i.e. the snapshot is taken by the DB during the first query and then
reused for the same query).

Correct.

However note that the default isolation level in PostgreSQL is READ COMMITTED, though it is important to point out that PostgreSQL's READ COMMITTED isolation level does *NOT* allow one session to see changes committed during query execution by concurrent transactions.

It is a common misunderstanding that MySQL's READ COMMITTED isolation level is the same as PostgreSQL's READ COMMITTED isolation level. It is not. PostgreSQL's READ COMMITTED isolation level is actually closest in behaviour to MySQL's REPEATABLE READ isolation level.

I bring this up because the proposed solution of setting the isolation level to READ COMMITTED will not work like you think it will on PostgreSQL. Regardless, please see below as to why setting the isolation level to READ COMMITTED is not the appropriate solution to this problem anyway...

In other words, the retry logic like the following will not work:

def allocate_obj():
     with session.begin(subtransactions=True):
          for i in xrange(n_retries):
               obj = session.query(Model).filter_by(**filters).first()
               count = session.query(Model).filter_by(
                   id=obj.id).update({'allocated': True})
               if count:
                    return obj

Since methods like allocate_obj() are usually called from within another
transaction, we can't simply move the transaction inside the 'for' loop
to fix the issue.

Exactly. The above code, from here:

https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py#L98

has no chance of working at all under the existing default isolation levels for either MySQL or PostgreSQL. If another session updates the same row between the time the first session began and the moment the UPDATE statement in the first session executes, the first session's UPDATE will return 0 rows affected. It will continue to return 0 rows affected on every loop iteration for as long as the same transaction/session is in effect, which in the code above is the case.
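To make that failure mode concrete, here is a toy simulation in plain Python (no database involved, and hypothetical rather than actual Neutron code): the dict named 'snapshot' stands in for the consistent read snapshot taken at the transaction's first query, while 'committed' stands in for the current committed state that the UPDATE's WHERE clause is matched against.

```python
def allocate_with_retries(snapshot, committed, n_retries=3):
    """Emulate 'SELECT ... WHERE allocated = 0' followed by a
    conditional UPDATE, retried inside a single transaction."""
    for _ in range(n_retries):
        # The SELECT always reads the same stale snapshot.
        candidates = [oid for oid, row in snapshot.items()
                      if not row['allocated']]
        if not candidates:
            return None
        oid = candidates[0]
        # The UPDATE is matched against committed data; if another
        # session already allocated the row, 0 rows match, every time.
        if not committed[oid]['allocated']:
            committed[oid]['allocated'] = True
            return oid
    return None

# A concurrent session allocated the row after our snapshot was taken,
# so every retry picks the same row and the update matches nothing.
stale = {1: {'allocated': False}}
current = {1: {'allocated': True}}
result = allocate_with_retries(stale, current)  # returns None
```

Since the snapshot never refreshes inside the transaction, increasing n_retries changes nothing; the loop simply repeats the same failing attempt.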

The particular issue here is
https://bugs.launchpad.net/neutron/+bug/1382064 with the proposed fix:
https://review.openstack.org/#/c/129288

So far the solution proven by tests is to change the transaction
isolation level for MySQL to READ COMMITTED.
The patch suggests changing the level only for the particular
transaction where the issue occurs (per SQLAlchemy, it is reverted to
the engine default once the transaction is committed).
This isolation level allows the code above to see a different result in
each iteration.

Not for PostgreSQL, see above. You would need to set the level to READ *UNCOMMITTED* to get that behaviour for PostgreSQL, and setting to READ UNCOMMITTED is opening up the code to a variety of other issues and should be avoided.
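For reference, the per-transaction switch the patch relies on is SQLAlchemy's Connection.execution_options(isolation_level=...), which overrides the engine's default level for that one connection. The sketch below uses SQLite, with a level SQLite happens to accept, purely so it runs anywhere; the set of accepted level names depends on the dialect.

```python
import sqlalchemy as sa

# The engine keeps its configured default isolation level; the
# execution option overrides it for this one connection, and the
# override does not affect other connections from the same engine.
engine = sa.create_engine('sqlite://')
conn = engine.connect().execution_options(isolation_level='SERIALIZABLE')
# On a MySQL engine the same call accepts 'READ COMMITTED'.
conn.close()
```
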

At the same time, any code that relies on a repeated query within the
same transaction returning the same result may potentially break.

So the question is: what do you think about changing the default
isolation level to READ COMMITTED for MySQL project-wise?
It is already the default for PostgreSQL; however, we don't have much
concurrent test coverage to guarantee that it's safe to move to a weaker
isolation level.

PostgreSQL READ COMMITTED is the same as MySQL's REPEATABLE READ. :) So, no, it doesn't work for PostgreSQL either.

The design of the Neutron plugin code's interaction with the SQLAlchemy session object is the main problem here. Instead of doing all of this within a single transactional container, the code should instead be changed to perform the SELECT statements in separate transactions/sessions.

That means not using the session parameter supplied to the neutron.plugins.ml2.drivers.helpers.TypeDriverHelper.allocate_partially_specified_segment() method, and instead performing the SQL statements in separate transactions.

Mike Bayer's EngineFacade blueprint work should hopefully unclutter the current passing of a session object everywhere, but until that hits, it should be easy enough to simply ensure that you don't use the same session object over and over again, instead of changing the isolation level.
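A minimal sketch of that separate-transactions approach, assuming a hypothetical 'allocations' table (SQLite is used only so the example is self-contained). The point is just that each retry attempt opens its own short-lived transaction, so it sees rows committed by other sessions since the previous attempt.

```python
import sqlalchemy as sa

# Hypothetical stand-in for an ml2 allocation table.
engine = sa.create_engine('sqlite://')
with engine.begin() as conn:
    conn.execute(sa.text(
        "CREATE TABLE allocations "
        "(id INTEGER PRIMARY KEY, allocated BOOLEAN NOT NULL)"))
    conn.execute(sa.text(
        "INSERT INTO allocations (id, allocated) VALUES (1, 0), (2, 0)"))

def allocate_segment(n_retries=3):
    """Compare-and-swap allocation: each attempt runs in its own
    short-lived transaction, so a retry observes rows committed by
    concurrent sessions instead of a stale snapshot."""
    for _ in range(n_retries):
        with engine.begin() as conn:  # fresh transaction per attempt
            row = conn.execute(sa.text(
                "SELECT id FROM allocations WHERE allocated = 0 "
                "ORDER BY id LIMIT 1")).first()
            if row is None:
                return None           # nothing left to allocate
            count = conn.execute(
                sa.text("UPDATE allocations SET allocated = 1 "
                        "WHERE id = :id AND allocated = 0"),
                {'id': row[0]}).rowcount
            if count:                 # 0 rows means we lost the race
                return row[0]
    return None
```

The conditional UPDATE still returns 0 rows when another session wins the race, but because the next attempt begins a new transaction, it sees the new committed state and can move on to a different row.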

All the best,
-jay

Your feedback is appreciated.

Thanks,
Eugene.



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

