On 8/25/15, 2:43 PM, "Andrew Laski" <and...@lascii.com> wrote:
>On 08/25/15 at 06:08pm, Gary Kotton wrote:
>>
>>On 8/25/15, 9:10 AM, "Matt Riedemann" <mrie...@linux.vnet.ibm.com> wrote:
>>
>>>On 8/25/2015 10:03 AM, Gary Kotton wrote:
>>>>
>>>> On 8/25/15, 7:04 AM, "Matt Riedemann" <mrie...@linux.vnet.ibm.com> wrote:
>>>>
>>>>> On 8/24/2015 9:32 PM, Gary Kotton wrote:
>>>>>> In item #2 below the reboot is done via the guest and not the nova
>>>>>> api's :)
>>>>>>
>>>>>> From: Gary Kotton <gkot...@vmware.com>
>>>>>> Reply-To: OpenStack List <openstack-dev@lists.openstack.org>
>>>>>> Date: Monday, August 24, 2015 at 7:18 PM
>>>>>> To: OpenStack List <openstack-dev@lists.openstack.org>
>>>>>> Subject: [openstack-dev] [nova] periodic task
>>>>>>
>>>>>> Hi,
>>>>>> A couple of months ago I posted a patch for bug
>>>>>> https://launchpad.net/bugs/1463688. The issue is as follows: the
>>>>>> periodic task detects that the instance state does not match the
>>>>>> state on the hypervisor and it shuts down the running VM. There are
>>>>>> a number of ways that this may happen and I will try and explain:
>>>>>>
>>>>>> 1. VMware driver example: a host where the instances are running
>>>>>>    goes down. This could be a power outage, host failure, etc. The
>>>>>>    first iteration of the periodic task will determine that the
>>>>>>    actual instance is down. This will update the state of the
>>>>>>    instance to DOWN. The VC has the ability to do HA and it will
>>>>>>    start the instance up and running again. The next iteration of
>>>>>>    the periodic task will determine that the instance is up and the
>>>>>>    compute manager will stop the instance.
>>>>>> 2. All drivers. The tenant decides to do a reboot of the instance
>>>>>>    and that coincides with the periodic task state validation.
>>>>>>    At this point in time the instance will not be up and the compute
>>>>>>    node will update the state of the instance as DOWN. Next iteration
>>>>>>    the states will differ and the instance will be shut down.
>>>>>>
>>>>>> Basically the issue hit us with our CI and there was no CI running
>>>>>> for a couple of hours due to the fact that the compute node decided
>>>>>> to shut down the running instances. The hypervisor should be the
>>>>>> source of truth and it should not be the compute node that decides
>>>>>> to shut down instances. I posted a patch to deal with this,
>>>>>> https://review.openstack.org/#/c/190047/, which is the reason for
>>>>>> this mail. The patch is backwards compatible so that the existing
>>>>>> deployments and random shutdown continue to work as they do today,
>>>>>> and the admin now has the ability just to log if there is an
>>>>>> inconsistency.
>>>>>>
>>>>>> We do not want to disable the periodic task as knowing the current
>>>>>> state of the instance is very important and has a ton of value, we
>>>>>> just do not want the periodic task to shut down a running instance.
>>>>>>
>>>>>> Thanks
>>>>>> Gary
>>>>>>
>>>>>> __________________________________________________________________________
>>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>>> Unsubscribe:
>>>>>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>
>>>>> In #2 the guest shouldn't be rebooted by the user (tenant) outside of
>>>>> the nova-api. I'm not sure if it's actually formally documented in
>>>>> the nova documentation, but from what I've always heard/known, nova
>>>>> is the control plane and you should be doing everything with your
>>>>> instances via the nova-api.
>>>>> If the user rebooted via nova-api, the task_state would be set and
>>>>> the periodic task would ignore the instance.
>>>>
>>>> Matt, this is one case that I showed where the problem occurs. There
>>>> are others and I can invest time to see them. The fact that the
>>>> periodic task is there is important. What I don't understand is why
>>>> having an option of log indication for an admin is something that is
>>>> not useful and instead we are going with having the compute node shut
>>>> down instances when this should not happen. Our infrastructure is
>>>> behaving like cattle. That should not be the case and the hypervisor
>>>> should be the source of truth.
>>>>
>>>> This is a serious issue and instances in production can and will go
>>>> down.
>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt Riedemann
>>>
>>>For the HA case #1, the periodic task checks to see if the instance.host
>>>doesn't match the compute service host [1] and skips if they don't
>>>match.
>>>
>>>Shouldn't your HA scenario be updating which host the instance is
>>>running on? Or is this a vCenter-ism?
>>
>>The nova compute node has not changed. It is not the compute node's host.
>>The host that the instance was running on was down and those instances
>>were moved.
>
>So this is a case where a single compute node is managing multiple
>hypervisors? It sounds like there is an assumption being made in the
>periodic task that doesn't hold true for the VMware driver, that a
>request for the power state of an instance would fail if the host was
>down. This may be a better fix here: to not sync the state if the host
>is down.
>
>>
>>For libvirt the same issues could happen if a process goes down and is
>>restarted (there may be some race conditions). But I am not familiar
>>enough with the ins and outs there. Just the fact that it is suggested in
>>some cases that people disable the periodic task indicates that this too
>>is an issue.
>>
>>But seriously, we need this and the change is non-intrusive, configurable
>>and backwards compatible. Honestly I see no reason why this is being
>>blocked.
>
>The change seems to be under discussion here because this is adding more
>complexity to an already quite complex method. I believe the desire is
>to find a model that simplifies, or at least doesn't add to the
>complexity of, the way that syncs are handled.

I am not sure I understand what extra complexity is being added here - the
patch in review just logs a message to the log file instead of stopping a
running instance.

How do you guys suggest that we move forward with this? At the moment the
code is blocked and this is a real problem in deployment.
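
The behaviour being argued over can be sketched roughly as below. This is
only an illustration, not the patch in review and not nova's actual code:
Instance, sync_power_state, the log_only flag and the state strings are all
hypothetical names. It shows the two skip conditions mentioned in the
thread (instance.host mismatch, task_state set) and the choice between
stopping a running instance and just logging the inconsistency.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Optional

LOG = logging.getLogger("power_state_sync_sketch")

# Hypothetical, simplified power states (not nova's real constants).
RUNNING = "running"
SHUTDOWN = "shutdown"


@dataclass
class Instance:
    uuid: str
    host: str                         # compute host recorded in the database
    power_state: str                  # last power state recorded in the database
    task_state: Optional[str] = None  # set while an API operation is in flight


def sync_power_state(instance: Instance,
                     hypervisor_state: str,
                     this_host: str,
                     stop_instance: Callable[[Instance], None],
                     log_only: bool = False) -> str:
    """Reconcile one instance; return the action that was taken."""
    # Skip instances owned by another host or with an operation in progress.
    if instance.host != this_host or instance.task_state is not None:
        return "skipped"
    if instance.power_state == hypervisor_state:
        return "in_sync"
    # Database says the instance is down but the hypervisor says it is running.
    if instance.power_state == SHUTDOWN and hypervisor_state == RUNNING:
        if log_only:
            # Proposed opt-in behaviour: report the mismatch, leave the VM alone.
            LOG.warning("Instance %s is running on the hypervisor but "
                        "recorded as shut down", instance.uuid)
            return "logged"
        # Behaviour described in the bug: the running instance is stopped.
        stop_instance(instance)
        return "stopped"
    # Any other mismatch: record what the hypervisor reports.
    instance.power_state = hypervisor_state
    return "updated"
```

With log_only left at its default of False the sketch keeps the stop
behaviour, which is the sense in which the change is described as backwards
compatible.
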

BTW I do not think that this is specific to the VMware driver - it is just
that we hit it first :)

>>>[1]
>>>http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py#n5871
>>>
>>>--
>>>
>>>Thanks,
>>>
>>>Matt Riedemann