Please see inline. cheers,
Rossella

On 05/20/2014 12:26 AM, Salvatore Orlando wrote:
> Some comments inline.
>
> Salvatore
>
> On 19 May 2014 20:32, sridhar basam <sridhar.ba...@gmail.com> wrote:
>
> > On Mon, May 19, 2014 at 1:30 PM, Jay Pipes <jaypi...@gmail.com> wrote:
> >
> > > Stackers,
> > >
> > > On Friday in Atlanta, I had the pleasure of moderating the database
> > > session at the Ops Meetup track. We had lots of good discussions and
> > > heard important feedback from operators on DB topics.
> > >
> > > For the record, I would not bring this point up so publicly unless I
> > > believed it was a serious problem affecting a large segment of users.
> > > When doing an informal survey of the users/operators in the room at
> > > the start of the session, out of approximately 200 people in the
> > > room, only a single person was using PostgreSQL, about a dozen were
> > > using standard MySQL master/slave replication, and the rest were
> > > using MySQL Galera clustering. So, this is a real issue for a large
> > > segment of the operators -- or at least the ones at the session. :)
> >
> > We are one of those operators that use Galera for replicating our
> > MySQL databases. We used to see issues with deadlocks when having
> > multiple MySQL writers in our cluster. As a workaround we have our
> > haproxy configuration in an active-standby configuration for our
> > MySQL VIP.
> >
> > I seem to recall we had a lot of the deadlocks happen through Neutron.
> > When we go through our Icehouse testing, we will redo our multi-master
> > MySQL setup and provide feedback on the issues we see.
>
> The SELECT ... FOR UPDATE issue is going to be a non-trivial one for
> Neutron as well. Some components, like IPAM, rely heavily on it.
> However, Neutron is a lot more susceptible to deadlock problems than
> Nova because at the moment it does not implement a retry mechanism.
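On the retry mechanism Salvatore mentions: a rough sketch of what such a wrapper could look like is below. Note this is only an illustration, not Neutron code; `DBDeadlockError`, `retry_on_deadlock` and `allocate_ip` are made-up names, with a plain exception standing in for whatever the DB layer (e.g. oslo.db) actually raises on a Galera deadlock.

```python
import functools
import time


class DBDeadlockError(Exception):
    """Stand-in for the deadlock error the DB layer would raise."""


def retry_on_deadlock(max_retries=5, delay=0.01):
    """Retry the wrapped DB operation when the backend reports a deadlock.

    Galera aborts one of two conflicting transactions with a deadlock
    error; simply re-running the aborted transaction usually succeeds.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except DBDeadlockError:
                    if attempt == max_retries - 1:
                        raise  # give up after the last attempt
                    time.sleep(delay * (2 ** attempt))  # simple backoff
        return wrapper
    return decorator


calls = {"n": 0}

@retry_on_deadlock(max_retries=3, delay=0)
def allocate_ip():
    # Simulate a transaction that deadlocks twice, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise DBDeadlockError()
    return "10.0.0.5"

print(allocate_ip())  # third attempt succeeds, prints 10.0.0.5
```

The point is that the retry has to wrap the whole transaction, not an individual statement, since Galera rolls the whole transaction back.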
> This is something which should be added during the Juno release cycle
> regardless of all the other enhancements currently being planned, such
> as task-oriented operations.
>
> > thanks,
> > Sridhar
>
> > > Peter Boros, from Percona, was able to provide some insight on
> > > MySQL Galera topics, and one issue came up that is likely the cause
> > > of a lot of heartache for operators who use MySQL Galera (or
> > > Percona XtraDB Cluster).
> > >
> > > We were discussing whether people had seen deadlock issues [1] when
> > > using MySQL Galera in their deployment, and were brainstorming on
> > > why deadlocks might be seen. I had suggested that perhaps Nova's
> > > use of autoincrementing primary keys may have been the cause. Peter
> > > pretty quickly dispatched that notion, saying that Galera
> > > automatically handles autoincrementing keys using managed
> > > auto_increment_increment and auto_increment_offset config options.
> > >
> > > I think at that point I mentioned that there were a number of
> > > places that were using the SELECT ... FOR UPDATE construct in Nova
> > > (in SQLAlchemy, it's the with_lockmode('update') modification of
> > > the query object). Peter promptly said that was a problem. MySQL
> > > Galera does not support SELECT ... FOR UPDATE, since it has no
> > > concept of cross-node locking of records, so results are
> > > non-deterministic.
> > >
> > > So... what to do?
> > >
> > > For starters, some information on the use of with_lockmode() in
> > > Nova and Neutron...
> > >
> > > Within Nova, there are actually only a few places where
> > > with_lockmode('update') is used. Unfortunately, the use of
> > > with_lockmode('update') is in the quota code, which tends to wrap
> > > largish blocks of code within the Nova compute execution code.
> > >
> > > Within Neutron, however, the use of with_lockmode('update') is all
> > > over the place. There are 44 separate uses of it in 11 different
> > > files.
>
> I will report on a separate thread on this, so that we can have an
> assessment of where locking statements are used and why.
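For anyone less familiar with the construct being discussed: this is roughly what it looks like at the SQLAlchemy level. The table here is a made-up stand-in, and I am using the newer `with_for_update()` spelling, which renders the same `SELECT ... FOR UPDATE` as the old `query.with_lockmode('update')`:

```python
# Requires SQLAlchemy (third-party). The table is a made-up stand-in
# for something like Neutron's IPAM tables.
from sqlalchemy import Column, Integer, MetaData, String, Table

metadata = MetaData()
fixed_ips = Table(
    "fixed_ips", metadata,
    Column("id", Integer, primary_key=True),
    Column("address", String(64)),
)

# with_for_update() is the modern equivalent of with_lockmode('update');
# both emit a pessimistic row lock that Galera cannot honour cluster-wide.
stmt = fixed_ips.select().with_for_update()
print(stmt)  # ... FROM fixed_ips FOR UPDATE
```

On a single MySQL node this blocks concurrent writers on the selected rows; on Galera the lock is only taken on the local node, which is exactly why two nodes can both "win" and one transaction later dies at certification time.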
> > > We have a number of options:
>
> I think option 0 should be to rework/redesign the code, where
> possible, to avoid DB-level locking at all.

I totally agree. Is anybody already coordinating this rework? I'd like to
help. After the redesign it is going to be easier to make a decision
regarding a distributed lock manager.

> > > 1) Stop using MySQL Galera for databases of projects that contain
> > > with_lockmode('update')
>
> This looks hideous, but I am afraid this is what all people wishing to
> deploy Icehouse should consider doing.
>
> > > 2) Put a big old warning in the docs somewhere about the problem of
> > > potential deadlocks or odd behaviour with Galera in these projects
> > >
> > > 3) For Nova and Neutron, remove the use of with_lockmode('update')
> > > and instead use a coarse-grained file lock or a distributed lock
> > > manager for those areas where we need deterministic reads or
> > > quiescence.
>
> We had an attempt at implementing a sort of distributed lock for
> Neutron: https://review.openstack.org/#/c/34695/
> Beyond the implementation reservations on this patch, one thing that
> should be noticed, probably needless to say, is that distributed
> coordination is something that should never be taken in a
> light-hearted way.
> Once all the non-locking solutions have been ruled out, distributed
> coordination among processes could be considered. In that case I think
> it might be better to use some off-the-shelf software rather than
> working out some home-grown solution (I surely do not see space for a
> new project here).
> On a side note, I'm rather ignorant about Python frameworks for
> distributed coordination... ConCoord? Is ZooKeeper something that
> should be ruled out because of language restrictions?
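On the coarse-grained file lock from option 3: a minimal sketch of what that could look like is below, assuming a single host (a file lock obviously does not coordinate across nodes) and a POSIX system, since it uses flock(2). The `FileLock` class and the lock file name are made up for illustration; in practice one would likely reach for an existing helper such as oslo's lockutils rather than hand-rolling this.

```python
import fcntl
import os
import tempfile


class FileLock:
    """Coarse-grained interprocess lock backed by flock(2) (POSIX only)."""

    def __init__(self, path):
        self.path = path
        self.fd = None

    def __enter__(self):
        # O_CREAT: the lock file need not exist beforehand.
        self.fd = os.open(self.path, os.O_CREAT | os.O_RDWR)
        fcntl.flock(self.fd, fcntl.LOCK_EX)  # blocks until acquired
        return self

    def __exit__(self, *exc):
        fcntl.flock(self.fd, fcntl.LOCK_UN)
        os.close(self.fd)


lock_path = os.path.join(tempfile.gettempdir(), "quota-refresh.lock")
with FileLock(lock_path):
    # Critical section: e.g. recompute a quota usage row without
    # relying on SELECT ... FOR UPDATE in the database.
    pass
```

The obvious trade-off is that this serialises all workers on one host behind a single lock, and does nothing for a multi-node control plane, which is exactly where a distributed lock manager (or, better, a redesign that avoids locking) would come in.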
> > > 4) For the Nova DB quota driver, refactor the driver either to use
> > > a non-locking method for reservation and quota queries, or to move
> > > the driver out into its own project (or use something like Climate
> > > and make sure that Climate uses a non-blocking algorithm for those
> > > queries...)
> > >
> > > Thoughts?
> > >
> > > -jay
> > >
> > > [1] http://lists.openstack.org/pipermail/openstack/2014-May/007202.html
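On option 4, one common non-locking technique for quotas is a compare-and-swap style UPDATE whose WHERE clause enforces the limit, checking the affected row count instead of locking the row first. A minimal sketch with stdlib sqlite3 is below; the `quota_usages` schema and `reserve()` helper are made up, only loosely modelled on Nova's quota tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE quota_usages (
    project_id TEXT PRIMARY KEY,
    resource   TEXT,
    in_use     INTEGER,
    hard_limit INTEGER)""")
conn.execute("INSERT INTO quota_usages VALUES ('demo', 'instances', 9, 10)")


def reserve(conn, project_id, delta):
    """Atomically consume quota without SELECT ... FOR UPDATE.

    The WHERE clause enforces the limit; the row count tells us whether
    the compare-and-swap succeeded. Under Galera a conflicting write on
    another node surfaces as a certification failure to retry, rather
    than depending on a row lock that only exists on one node.
    """
    cur = conn.execute(
        "UPDATE quota_usages SET in_use = in_use + ? "
        "WHERE project_id = ? AND in_use + ? <= hard_limit",
        (delta, project_id, delta))
    return cur.rowcount == 1


print(reserve(conn, "demo", 1))   # True: 9 + 1 <= 10
print(reserve(conn, "demo", 1))   # False: 10 + 1 > 10, quota exhausted
```

This pairs naturally with a deadlock/conflict retry loop, and it avoids the cross-node locking semantics Galera cannot provide.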
_______________________________________________ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev