Hi,

I set up OpenStack according to Martin's tutorial at hastexo.com today on my development machine inside VirtualBox. As I forgot to change the libvirt_type to qemu and kvm isn't available inside VirtualBox, nova-compute understandably failed to boot the VM I created.

I changed the value in nova.conf (and nova-compute.conf as well) and restarted the nova services, expecting everything to boot up correctly now, but nova-compute didn't recover at all. Instead, I got this exception in the logs:

2012-03-27 14:59:01 CRITICAL nova [-] Instance instance-00000001 could not be found.
(nova): TRACE: Traceback (most recent call last):
(nova): TRACE:   File "/usr/bin/nova-compute", line 49, in <module>
(nova): TRACE:     service.wait()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 413, in wait
(nova): TRACE:     _launcher.wait()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 131, in wait
(nova): TRACE:     service.wait()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 166, in wait
(nova): TRACE:     return self._exit_event.wait()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
(nova): TRACE:     return hubs.get_hub().switch()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 177, in switch
(nova): TRACE:     return self.greenlet.switch()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 192, in main
(nova): TRACE:     result = function(*args, **kwargs)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 101, in run_server
(nova): TRACE:     server.start()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/service.py", line 162, in start
(nova): TRACE:     self.manager.init_host()
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 247, in init_host
(nova): TRACE:     self.reboot_instance(context, instance['uuid'])
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
(nova): TRACE:     return f(*args, **kw)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 153, in decorated_function
(nova): TRACE:     function(self, context, instance_uuid, *args, **kwargs)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 171, in decorated_function
(nova): TRACE:     return function(self, context, instance_uuid, *args, **kwargs)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 898, in reboot_instance
(nova): TRACE:     reboot_type)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
(nova): TRACE:     return f(*args, **kw)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 753, in reboot
(nova): TRACE:     if self._soft_reboot(instance):
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 773, in _soft_reboot
(nova): TRACE:     dom = self._lookup_by_name(instance.name)
(nova): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/connection.py", line 1567, in _lookup_by_name
(nova): TRACE:     raise exception.InstanceNotFound(instance_id=instance_name)
(nova): TRACE: InstanceNotFound: Instance instance-00000001 could not be found.
(nova): TRACE:

I can only guess that nova-compute failed at a point where a failure isn't expected and left the data for this VM in an undefined state. I found no way to recover from this failure.

I tried to just "nova delete" the machine, checked a few minutes later using "nova show" and saw:

OS-DCF:diskConfig                   | MANUAL
OS-EXT-SRV-ATTR:host                | vagrant-precise64
OS-EXT-SRV-ATTR:hypervisor_hostname | None
OS-EXT-SRV-ATTR:instance_name       | instance-00000001
OS-EXT-STS:power_state              | 8
OS-EXT-STS:task_state               | deleting
OS-EXT-STS:vm_state                 | active
...

That looks right, but the deletion process never finishes. Nothing at all happens in the logs. In "nova list", the instance is still listed as "Status: ACTIVE".

I tried to stop nova, delete the instance directory in /var/lib/nova/instances and restart nova, but that didn't help either (same exception).

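To make clear what I mean: judging from the first traceback, init_host tries to reboot the instance that the DB lists, the libvirt lookup raises InstanceNotFound, and the exception propagates all the way up and takes the service down. Below is a minimal standalone sketch of the kind of guard I would have expected around that loop. This is not nova's code; FakeHypervisor, the simplified init_host signature and the second instance name are made up for illustration.

```python
class InstanceNotFound(Exception):
    """Stand-in for nova's exception of the same name."""


class FakeHypervisor:
    """Hypothetical driver that only knows the domains it was given."""

    def __init__(self, domains):
        self.domains = set(domains)

    def reboot(self, name):
        # Mirrors the failing lookup: unknown domains raise.
        if name not in self.domains:
            raise InstanceNotFound("Instance %s could not be found." % name)
        return "rebooted %s" % name


def init_host(db_instances, hypervisor):
    """Reboot every instance the DB lists.

    Instances the hypervisor does not know are collected in `missing`
    instead of letting the exception abort service startup.
    """
    rebooted, missing = [], []
    for name in db_instances:
        try:
            hypervisor.reboot(name)
            rebooted.append(name)
        except InstanceNotFound:
            missing.append(name)
    return rebooted, missing


# The DB lists two instances, but only the second one exists in the hypervisor.
rebooted, missing = init_host(
    ["instance-00000001", "instance-00000002"],
    FakeHypervisor(["instance-00000002"]),
)
print(rebooted, missing)
```

With something like this, the service would still come up, and the missing instance could be flagged as being in an error state so that "nova delete" has something to work with.
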
I stopped nova again, deleted the VM's rows from the instances table (plus security_group_instance_association and instance_info_caches) in nova's MySQL DB and restarted nova, but just got a different exception in the logs:

2012-03-27 14:55:11 ERROR nova.rpc.amqp [req-26b4686a-85f4-4566-bb0f-d87e8456b1f2 6b177562cbc1434fade182a45427134d 3a21af5fa5fc470ebe2f2471ff5b49d3] Exception during message handling
(nova.rpc.amqp): TRACE: Traceback (most recent call last):
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/rpc/amqp.py", line 252, in _process_data
(nova.rpc.amqp): TRACE:     rval = node_func(context=ctxt, **node_args)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
(nova.rpc.amqp): TRACE:     return f(*args, **kw)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 142, in decorated_function
(nova.rpc.amqp): TRACE:     locked = self.get_lock(context, instance_uuid)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/exception.py", line 114, in wrapped
(nova.rpc.amqp): TRACE:     return f(*args, **kw)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 171, in decorated_function
(nova.rpc.amqp): TRACE:     return function(self, context, instance_uuid, *args, **kwargs)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 1597, in get_lock
(nova.rpc.amqp): TRACE:     instance_ref = self.db.instance_get_by_uuid(context, instance_uuid)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/db/api.py", line 549, in instance_get_by_uuid
(nova.rpc.amqp): TRACE:     return IMPL.instance_get_by_uuid(context, uuid)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 120, in wrapper
(nova.rpc.amqp): TRACE:     return f(*args, **kwargs)
(nova.rpc.amqp): TRACE:   File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/api.py", line 1345, in instance_get_by_uuid
(nova.rpc.amqp): TRACE:     raise exception.InstanceNotFound(instance_id=uuid)
(nova.rpc.amqp): TRACE: InstanceNotFound: Instance 73e90a02-7cef-4d64-a369-fbbc668ea91c could not be found.

Of course, I can just reset the whole DB and try again, as this is a development machine … but shouldn't nova-compute handle this (or any) kind of failure more gracefully? Is there a way to cleanly recover from this situation?

Best regards,
Philipp