Deletes should only be allowed when the VM is in a powered-off state. That would keep the state transitions consistent.
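A rough sketch of the kind of check I have in mind is below. The state names, the `ALLOWED` table and `apply_action` are made up for illustration; they are not nova's actual states or API.

```python
from enum import Enum


class PowerState(Enum):
    # Hypothetical states, for illustration only (not nova's real power states).
    RUNNING = "running"
    POWERED_OFF = "powered_off"
    DELETED = "deleted"


# Hypothetical transition table: each action lists the states it may start
# from and the state it ends in.
ALLOWED = {
    "stop": ({PowerState.RUNNING}, PowerState.POWERED_OFF),
    "start": ({PowerState.POWERED_OFF}, PowerState.RUNNING),
    "delete": ({PowerState.POWERED_OFF}, PowerState.DELETED),
}


class InvalidTransition(Exception):
    pass


def apply_action(state, action):
    """Return the new state, or raise if `action` is not allowed from `state`."""
    sources, target = ALLOWED[action]
    if state not in sources:
        raise InvalidTransition(f"cannot {action} from {state.value}")
    return target
```

With a table like this, a delete against a running VM is rejected until the VM has been stopped, so a caller would issue "stop" and then "delete".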
Thanks
Al

On Oct 26, 2013, at 8:55 AM, Joshua Harlow <[email protected]> wrote:

> I think I will try to have an unconference at the HK summit about ideas the
> cinder developers (and the taskflow developers, since it's not a concept that
> is unique/applicable to just cinder) are having about said state machine
> (and its potential usage).
>
> So look out for that, be interesting to have some nova folks involved there
> also :-)
>
> Sent from my really tiny device...
>
> On Oct 26, 2013, at 3:14 AM, "Alex Glikson" <[email protected]> wrote:
>
>> +1
>>
>> Regards,
>> Alex
>>
>>
>> Joshua Harlow <[email protected]> wrote on 26/10/2013 09:29:03 AM:
>>
>> > An idea that others and I are having for a similar use case in
>> > cinder (or it appears to be similar).
>> >
>> > If there was a well defined state machine/s in nova with well
>> > defined and managed transitions between states then it seems like
>> > this state machine could resume on failure as well as be interrupted
>> > when a "dueling" or preemptable operation arrives (a delete while
>> > being created for example). This way not only would it be very clear
>> > the set of states and transitions but it would also be clear how
>> > preemption occurs (and under what cases).
>> >
>> > Right now in nova there is a distributed and ad-hoc state machine
>> > which if it was more formalized it could inherit some of the
>> > described useful capabilities. It would also be much more resilient
>> > to these types of locking problems that you described.
>> >
>> > IMHO that's the only way these types of problems will be fully
>> > fixed, not by more queues or more periodic tasks, but by solidifying
>> > & formalizing the state machines that compose the work nova does.
>> >
>> > Sent from my really tiny device...
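Inline, on the state machine idea above: a minimal sketch of a build workflow that can be resumed after a failed step and preempted by a delete between steps. The class, the step names and the `do_step` callback are all hypothetical; this is not nova or taskflow code.

```python
import threading


class BuildStateMachine:
    """Hypothetical workflow: SCHEDULING -> NETWORKING -> SPAWNING -> ACTIVE."""

    STEPS = ["SCHEDULING", "NETWORKING", "SPAWNING", "ACTIVE"]

    def __init__(self):
        self.state = self.STEPS[0]
        self._delete_requested = threading.Event()

    def request_delete(self):
        # Safe to call from another thread; it never waits on the build.
        self._delete_requested.set()

    def run(self, do_step):
        """Advance from the current state; `do_step(state)` does the real work."""
        while self.state not in ("ACTIVE", "DELETED"):
            if self._delete_requested.is_set():
                # Preemption point: a pending delete wins before the next step.
                self.state = "DELETED"
                break
            # If do_step raises, self.state is left unchanged, so a later
            # call to run() resumes from the step that failed.
            do_step(self.state)
            self.state = self.STEPS[self.STEPS.index(self.state) + 1]
        return self.state
```

Note that preemption here only happens between steps, so a step that hangs forever would still need a timeout of its own.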
>> >
>> > > On Oct 25, 2013, at 3:52 AM, "Day, Phil" <[email protected]> wrote:
>> > >
>> > > Hi Folks,
>> > >
>> > > We're very occasionally seeing problems where a thread processing
>> > > a create hangs (and we've seen this when talking to Cinder and Glance).
>> > > Whilst those issues need to be hunted down in their own right, they
>> > > do show up what seems to me to be a weakness in the processing of
>> > > delete requests that I'd like to get some feedback on.
>> > >
>> > > Delete is the one operation that is allowed regardless of the
>> > > Instance state (since it's a one-way operation, and users should
>> > > always be able to free up their quota). However, when we get a
>> > > create thread hung in one of these states, the delete requests will
>> > > also block when they hit the manager, as they are synchronized on
>> > > the uuid. Because the user making the delete request doesn't see
>> > > anything happen, they tend to submit more delete requests. The
>> > > Service is still up, so these go to the compute manager as well,
>> > > and eventually all of the threads will be waiting for the lock, and
>> > > the compute manager will stop consuming new messages.
>> > >
>> > > The problem isn't limited to deletes - although in most cases the
>> > > change of state in the API means that you have to keep making
>> > > different calls to get past the state checker logic to do it with an
>> > > instance stuck in another state. Users also seem to be more
>> > > impatient with deletes, as they are trying to free up quota for other
>> > > things.
>> > >
>> > > So while I know that we should never get a thread into a hung
>> > > state in the first place, I was wondering about one of the
>> > > following approaches to address just the delete case:
>> > >
>> > > i) Change the delete call on the manager so it doesn't wait for
>> > > the uuid lock. Deletes should be coded so that they work regardless
>> > > of the state of the VM, and other actions should be able to cope
>> > > with a delete being performed from under them. There is of course
>> > > no guarantee that the delete itself won't block as well.
>> > >
>> > > ii) Record in the API server that a delete has been started (maybe
>> > > enough to use the task state being set to DELETING in the API if
>> > > we're sure this doesn't get cleared), and add a periodic task in the
>> > > compute manager to check for and delete instances that are in a
>> > > "DELETING" state for more than some timeout. Then the API, knowing
>> > > that the delete will be processed eventually, can just no-op any
>> > > further delete requests.
>> > >
>> > > iii) Add some hook into the ServiceGroup API so that the timer
>> > > could depend on getting a free thread from the compute manager pool
>> > > (i.e. run some no-op task) - so that if there are no free threads then
>> > > the service becomes down. That would (eventually) stop the scheduler
>> > > from sending new requests to it, and make deletes be processed in
>> > > the API server, but won't of course help with commands for other
>> > > instances on the same host.
>> > >
>> > > iv) Move away from having a general topic and thread pool for all
>> > > requests, and start a listener on an instance-specific topic for
>> > > each running instance on a host (leaving the general topic and pool
>> > > just for creates and other non-instance calls like the hypervisor
>> > > API). Then a blocked task would only affect requests for a
>> > > specific instance.
>> > >
>> > > I'm tending towards ii) as a simple and pragmatic solution in the
>> > > near term, although I like both iii) and iv) as being generally
>> > > good enhancements - but iv) in particular feels like a pretty seismic
>> > > change.
>> > >
>> > > Thoughts please,
>> > >
>> > > Phil
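For reference, option ii) in Phil's message could look roughly like the sketch below: the API records the delete once and no-ops repeats, and a periodic task finishes deletes that have been stuck too long. Everything here (`Instance`, `api_delete`, `reap_stuck_deletes`, `DELETE_TIMEOUT`, the `cast_delete` and `force_delete` callbacks) is a hypothetical stand-in, not nova's real objects or periodic task API.

```python
import time

DELETE_TIMEOUT = 300  # seconds; hypothetical threshold


class Instance:
    """Hypothetical stand-in for an instance record."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.task_state = None
        self.deleting_since = None


def api_delete(instance, cast_delete, now=time.time):
    """API side: record the delete once and no-op any repeated requests."""
    if instance.task_state == "DELETING":
        return False  # already in progress; nothing new goes to the manager
    instance.task_state = "DELETING"
    instance.deleting_since = now()
    cast_delete(instance)
    return True


def reap_stuck_deletes(instances, force_delete, now=time.time):
    """Periodic task: finish deletes that have been DELETING for too long."""
    reaped = []
    for inst in instances:
        if inst.task_state != "DELETING":
            continue
        if now() - inst.deleting_since > DELETE_TIMEOUT:
            force_delete(inst)
            reaped.append(inst.uuid)
    return reaped
```

Repeated delete requests never reach the compute manager in this arrangement, so they cannot pile up behind the uuid lock; the cost is that a stuck delete is only cleaned up after the timeout.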
_______________________________________________
OpenStack-dev mailing list
[email protected]
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
