On 25/02/15 11:51, Radoslav Gerganov wrote:
> On 02/23/2015 03:18 PM, Matthew Booth wrote:
>> On 23/02/15 12:13, Gary Kotton wrote:
>>>
>>> On 2/23/15, 2:05 PM, "Matthew Booth" <mbo...@redhat.com> wrote:
>>>
>>>> On 20/02/15 11:48, Matthew Booth wrote:
>>>>> Gary Kotton came across a doozy of a bug recently:
>>>>>
>>>>> https://bugs.launchpad.net/nova/+bug/1419785
>>>>>
>>>>> In short, when you start a Nova compute, it will query the driver for instances and compare that against the expected host of the instance according to the DB. If the driver is reporting an instance the DB thinks is on a different host, it assumes the instance was evacuated while Nova compute was down, and deletes it on the hypervisor. However, Gary found that you trigger this when starting up a backup HA node which has a different `host` config setting. i.e. you fail over, and the first thing it does is delete all your instances.
>>>>>
>>>>> Gary and I both agree on a couple of things:
>>>>>
>>>>> 1. Deleting all your instances is bad
>>>>> 2. HA nova compute is highly desirable for some drivers
>>>>>
>>>>> We disagree on the approach to fixing it, though. Gary posted this:
>>>>>
>>>>> https://review.openstack.org/#/c/154029/
>>>>>
>>>>> I've already outlined my objections to this approach elsewhere, but to summarise I think it fixes one symptom of a design problem and leaves the rest untouched. If the value of nova compute's `host` changes, then the assumption that instances associated with that compute can be identified by the value of instance.host becomes invalid. This assumption is pervasive, so it breaks a lot of stuff. The worst offender is _destroy_evacuated_instances(), which Gary found, but if you scan nova/compute/manager for the string 'self.host' you'll find lots of them. For example, all the periodic tasks are broken, including image cache management, and the state of ResourceTracker will be unusual. Worse, whenever a new instance is created it will have a different value of instance.host, so instances running on a single hypervisor will become partitioned based on which nova compute was used to create them.
>>>>>
>>>>> In short, the system may appear to function superficially, but it's unsupportable.
>>>>>
>>>>> I had an alternative idea. The current assumption is that the `host` managing a single hypervisor never changes. If we break that assumption, we break Nova, so we could assert it at startup and refuse to start if it's violated. I posted this VMware-specific POC:
>>>>>
>>>>> https://review.openstack.org/#/c/154907/
>>>>>
>>>>> However, I think I've had a better idea. Nova creates ComputeNode objects for its current configuration at startup which, amongst other things, are a map of host:hypervisor_hostname. We could assert when creating a ComputeNode that hypervisor_hostname is not already associated with a different host, and refuse to start if it is. We would give an appropriate error message explaining that this is a misconfiguration. This would prevent the user from hitting any of the associated problems, including the deletion of all their instances.
>>>>
>>>> I have posted a patch implementing the above for review here:
>>>>
>>>> https://review.openstack.org/#/c/158269/
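For concreteness, the startup assertion described in the quoted proposal above might look roughly like the sketch below. It is illustrative only: the helper name, the ComputeNodeList.get_by_hypervisor() lookup and the plain RuntimeError are assumptions, not the contents of the patch under review.

from nova import objects


def _assert_hypervisor_not_claimed(context, my_host, hypervisor_hostname):
    """Fail fast instead of letting a misconfigured backup node start."""
    # Hypothetical lookup: all compute nodes currently reporting this
    # hypervisor. ComputeNodeList.get_by_hypervisor() is assumed here.
    nodes = objects.ComputeNodeList.get_by_hypervisor(context,
                                                      hypervisor_hostname)
    for node in nodes:
        if node.host != my_host:
            # Two nova-computes with different 'host' values would otherwise
            # manage the same hypervisor, and _destroy_evacuated_instances()
            # would then wrongly delete the other host's instances.
            raise RuntimeError(
                "hypervisor %s is already managed by host %s; refusing to "
                "start with host=%s"
                % (hypervisor_hostname, node.host, my_host))

The point of a check like this is that it runs once, at service startup, before _destroy_evacuated_instances() or any periodic task gets a chance to act on the misconfiguration.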
>>> I have to look at what you have posted. I think that this topic is something that we should speak about at the summit, and it should fall under a blueprint and a well-defined spec. I really would not like to see existing installations being broken if and when this patch lands. It may also affect Ironic, as it works on the same model.
>>
>> This patch will only affect installations configured with multiple compute hosts for a single hypervisor. These are already broken, so this patch will at least let them know if they haven't already noticed.
>>
>> It won't affect Ironic, because they configure all compute hosts to have the same 'host' value. An Ironic user would only notice this patch if they accidentally misconfigured it, which is the intended behaviour.
>>
>> Incidentally, I also support more focus on the design here. Until we come up with a better design, though, we need to do our best to prevent non-trivial corruption from a trivial misconfiguration. I think we need to merge this, or something like it, now and still have a summit discussion.
>>
>> Matt
>
> Hi Matt,
>
> I already posted a comment on your patch, but I'd like to reiterate here as well. Currently the VMware driver is using the cluster name as hypervisor_hostname, which is a problem because you can have different clusters with the same name. We already have a critical bug filed for this:
>
> https://bugs.launchpad.net/nova/+bug/1329261
>
> There was an attempt to fix this by using a combination of vCenter UUID + cluster_name, but it was rejected because this combination was not considered a 'real' hostname. I think that if we go for a DB schema change we can fix both issues by renaming hypervisor_hostname to hypervisor_id and making it unique. What do you think?
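A rough sketch of the schema change suggested above, assuming a sqlalchemy-migrate style migration of the kind Nova uses; the constraint name and the double table reflection are illustrative, not a proposed migration.

from migrate.changeset.constraint import UniqueConstraint
from sqlalchemy import MetaData, Table


def upgrade(migrate_engine):
    meta = MetaData(bind=migrate_engine)
    compute_nodes = Table('compute_nodes', meta, autoload=True)

    # Rename the column so its name no longer implies a resolvable hostname.
    compute_nodes.c.hypervisor_hostname.alter(name='hypervisor_id')

    # Re-reflect the table and add a unique constraint so two compute nodes
    # can never claim the same hypervisor.
    compute_nodes = Table('compute_nodes', MetaData(bind=migrate_engine),
                          autoload=True)
    UniqueConstraint('hypervisor_id', table=compute_nodes,
                     name='uniq_compute_nodes0hypervisor_id').create()

Of course, a unique constraint can only be added once existing duplicates, such as the identically named clusters from bug 1329261, have been resolved, which is presumably why the rename and a vCenter UUID-based identifier would need to land together.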
Well, I think hypervisor_id makes more sense than hypervisor_hostname; the latter is confusing. However, I'd prefer not to complicate this change with it. I'm pessimistic enough as it is with its current scope.

Re the cluster name change, I assume you're referring to this change:

https://review.openstack.org/#/c/99623/

I have to say I don't agree with the reasoning behind the rejection. The only thing the API layer is going to be able to do is check whether it's got dots in it, and it would only pass for Ironic's uuids coincidentally. Still, it's a trivial change to that patch to make it pass, so we should just do it.

I think this issue is orthogonal to my patch, though, because that behaviour is already unintentional and broken.

Matt

--
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490