Reviewed: https://review.opendev.org/c/openstack/nova/+/867832 Committed: https://opendev.org/openstack/nova/commit/aa3e8fef7b949ec3ddb3c4eaa348eb004593d29e Submitter: "Zuul (22348)" Branch: master
commit aa3e8fef7b949ec3ddb3c4eaa348eb004593d29e Author: Pierre-Samuel Le Stang <pierre-samuel.le-st...@corp.ovh.com> Date: Thu Dec 15 18:30:15 2022 +0100 Correctly reset instance task state in rebooting hard When a user ask for a reboot hard of a running instance while nova compute is unavailable (service stopped or host down) it might happens under certain conditions that the instance stays in rebooting_hard task_state after nova-compute start again. This patch aims to fix that. Closes-Bug: #1999674 Change-Id: I170e390fe4e467898a8dc7df6a446f62941d49ff ** Changed in: nova Status: In Progress => Fix Released -- You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova). https://bugs.launchpad.net/bugs/1999674 Title: nova compute service does not reset instance with task_state in rebooting_hard Status in OpenStack Compute (nova): Fix Released Bug description: Description =========== When a user ask for a reboot hard of a running instance while nova compute is unavailable (service stopped or host down) it might happens under certain conditions that the instance stays in rebooting_hard task_state after nova-compute start again. The condition to get this issue is to have a rabbitmq message-ttl of messages in queue which is lower than the time needed to get nova compute up again. Steps to reproduce ================== Prerequisites: * Set a low message-ttl (let's say 60 seconds) in your rabbitmq * Have a running instance on a host First case is having a failure on nova-compute service 1/ stop nova compute service on host 2/ ask for a reboot hard: openstack server reboot --hard <instance_id> 3/ wait 60 seconds 4/ start nova compute service 5/ check instance task_state and status Second case is having a failure on the host 1/ hard shutdown the host (let's say a power supply issue) 2/ ask for a reboot hard: openstack server reboot --hard <instance_id> 3/ wait 60 seconds 2/ restart the host 5/ check instance task_state and status Expected result =============== We expect nova compute to be able to reset the state to active as we lost the message, to let the user take some other actions on the instance. Actual result ============= The instance is stuck in rebooting_hard task_state, user is blocked To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1999674/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp