Excerpts from Zane Bitter's message of 2015-01-09 14:57:21 -0800:
> On 08/01/15 05:39, Anant Patil wrote:
> > 1. The stack was failing when there were single disjoint resources or
> > just one resource in the template. The graph did not include this
> > resource due to a minor bug in dependency_names(). I have added a test
> > case and a fix here:
> > https://github.com/anantpatil/heat-convergence-prototype/commit/b58abd77cf596475ecf3f19ed38adf8ad3bb6b3b
>
> Thanks, sorry about that! I will push a patch to fix it up.
>
> > 2. The resource graph is created with keys in both forward-order
> > traversal and reverse-order traversal, and the update will finish the
> > forward order and then attempt the reverse order. If this is the case,
> > then the update-replaced resources will be deleted before the update
> > is complete, and if the update fails, the old resource is not
> > available for rollback; a new resource has to be created then. I have
> > added a test case at the above-mentioned location.
> >
> > In our PoC, the updates (concurrent updates) won't remove an
> > update-replaced resource until all the resources are updated and the
> > resource clean-up phase has started.
>
> Hmmm, this is a really interesting question actually. That's certainly
> not how Heat works at the moment; we've always assumed that rollback is
> "best-effort" at recovering the exact resources you had before. It
> would be great to have users weigh in on how they expect this to
> behave. I'm curious now what CloudFormation does.
>
> I'm reluctant to change it, though, because I'm pretty sure this is
> definitely *not* how you would want e.g. a rolling update of an
> autoscaling group to happen.
>
> > It is unacceptable to remove the old resource to be rolled back to,
> > since it may have changes which the user doesn't want to lose;
>
> If they didn't want to lose it, they shouldn't have tried an update
> that would replace it. If an update causes a replacement or an
> interruption to service, then I consider the same fair game for the
> rollback - the user has already given us permission for that kind of
> change. (Whether the user's consent was informed is a separate
> question, addressed by Ryan's update-preview work.)
In the original vision we had for using scaled groups to manage, say,
nova-compute nodes, you definitely can't "create" new servers, so you
can't just create all the new instances without de-allocating some.
That said, that's why we are using in-place methods like rebuild. I
think it would be acceptable to have cleanup run asynchronously, and to
have rollback re-create anything that has already been cleaned up.

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev