** Changed in: nova
       Status: Fix Committed => Fix Released

** Changed in: nova
    Milestone: None => kilo-3

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1367186

Title:
  Instances stuck with task_state of unshelving after RPC call timeout.

Status in OpenStack Compute (Nova):
  Fix Released

Bug description:
  Instances get stuck with a task_state of unshelving when the RPC call
  between nova-conductor and nova-scheduler fails (for example, because
  of a timeout) during an unshelve operation.

  The environment:
  Ubuntu 14.04 LTS (64-bit)
  stable/icehouse (2014.1.2)
  (I could also reproduce it with master, commit a1fa42f2ad11258f8b9482353e078adcf73ee9c2.)

  How to reproduce:
  1. Create a VM instance.
  2. Shelve the VM instance.
  3. Stop the nova-scheduler process.
  4. Unshelve the VM instance.
     (nova-conductor calls nova-scheduler, but the RPC call times out.)

  The VM instance then gets stuck with a task_state of unshelving (see
  the nova list output and nova-conductor.log excerpt below). It remains
  stuck even after the nova-scheduler process is started again.

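The failure mode described above can be illustrated with a small, self-contained Python sketch. It is not nova code: `Instance`, `unshelve_unguarded`, `unshelve_guarded`, `select_destinations_unavailable`, and `task_state_after_timeout` are hypothetical names, and `MessagingTimeout` is a local stand-in for the oslo.messaging exception of that name. The sketch assumes the task_state is already unshelving before the scheduler is called, and shows one possible way to avoid the stuck state; the fix actually committed may differ.

```python
class MessagingTimeout(Exception):
    """Local stand-in for the oslo.messaging exception of the same name."""


class Instance:
    """Hypothetical, minimal model of an instance record."""

    def __init__(self, uuid):
        self.uuid = uuid
        self.vm_state = "shelved_offloaded"
        # Assumption: task_state is set before the scheduler is called.
        self.task_state = "unshelving"

    def save(self):
        # A real record would be persisted here; this model has nothing to write.
        pass


def select_destinations_unavailable(instance):
    """Simulates the scheduler RPC call while nova-scheduler is stopped."""
    raise MessagingTimeout("Timed out waiting for a reply")


def unshelve_unguarded(instance, select_destinations):
    # The timeout propagates from here, so task_state is never cleared.
    host = select_destinations(instance)
    instance.vm_state, instance.task_state = "active", None
    instance.save()
    return host


def unshelve_guarded(instance, select_destinations):
    try:
        host = select_destinations(instance)
    except MessagingTimeout:
        # Clear the task state so the unshelve can be retried later,
        # then re-raise so the caller still sees the failure.
        instance.task_state = None
        instance.save()
        raise
    instance.vm_state, instance.task_state = "active", None
    instance.save()
    return host


def task_state_after_timeout(unshelve):
    """Runs one unshelve attempt against the unavailable scheduler."""
    instance = Instance("12e488e8-1df1-479d-866e-51c3117e384b")
    try:
        unshelve(instance, select_destinations_unavailable)
    except MessagingTimeout:
        pass
    return instance.task_state


print(task_state_after_timeout(unshelve_unguarded))  # unshelving
print(task_state_after_timeout(unshelve_guarded))    # None
```

In the unguarded variant the instance keeps its unshelving task_state after the timeout, which matches the symptom reported here. The guarded variant resets it while still surfacing the error.
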
  stack@devstack-icehouse:/opt/devstack$ nova list
  +--------------------------------------+---------+-------------------+------------+-------------+-------------------+
  | ID                                   | Name    | Status            | Task State | Power State | Networks          |
  +--------------------------------------+---------+-------------------+------------+-------------+-------------------+
  | 12e488e8-1df1-479d-866e-51c3117e384b | server1 | SHELVED_OFFLOADED | unshelving | Shutdown    | public=10.0.2.194 |
  +--------------------------------------+---------+-------------------+------------+-------------+-------------------+

  nova-conductor.log:
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  2014-09-09 18:18:13.263 13087 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: Timed out waiting for a reply to message ID 934be80a9798443597f355d60fa08e56
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     incoming.message))
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     return self._do_dispatch(endpoint, method, ctxt, args)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     result = getattr(endpoint, method)(ctxt, **new_args)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/conductor/manager.py", line 849, in unshelve_instance
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     instance)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/conductor/manager.py", line 816, in _schedule_instances
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     request_spec, filter_properties)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/opt/stack/nova/nova/scheduler/rpcapi.py", line 103, in select_destinations
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     request_spec=request_spec, filter_properties=filter_properties)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     retry=self.retry)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     timeout=timeout, retry=retry)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 404, in send
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     retry=retry)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 393, in _send
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     result = self._waiter.wait(msg_id, timeout)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 281, in wait
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     reply, ending = self._poll_connection(msg_id, timeout)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher   File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 231, in _poll_connection
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher     % msg_id)
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher MessagingTimeout: Timed out waiting for a reply to message ID 934be80a9798443597f355d60fa08e56
  2014-09-09 18:18:13.263 13087 TRACE oslo.messaging.rpc.dispatcher
  2014-09-09 18:18:13.274 13087 ERROR oslo.messaging._drivers.common [-] Returning exception Timed out waiting for a reply to message ID 934be80a9798443597f355d60fa08e56 to caller
  2014-09-09 18:18:13.275 13087 ERROR oslo.messaging._drivers.common [-] ['Traceback (most recent call last):\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply\n incoming.message))\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', ' File "/opt/stack/nova/nova/conductor/manager.py", line 849, in unshelve_instance\n instance)\n', ' File "/opt/stack/nova/nova/conductor/manager.py", line 816, in _schedule_instances\n request_spec, filter_properties)\n', ' File "/opt/stack/nova/nova/scheduler/rpcapi.py", line 103, in select_destinations\n request_spec=request_spec, filter_properties=filter_properties)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call\n retry=self.retry)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send\n timeout=timeout, retry=retry)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 404, in send\n retry=retry)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 393, in _send\n result = self._waiter.wait(msg_id, timeout)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 281, in wait\n reply, ending = self._poll_connection(msg_id, timeout)\n', ' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 231, in _poll_connection\n % msg_id)\n', 'MessagingTimeout: Timed out waiting for a reply to message ID 934be80a9798443597f355d60fa08e56\n']
  ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1367186/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp