[Openstack] problem with '_heal_instance_info_cache': am i the only one hitting this?

Don Waterloo Mon, 02 Feb 2015 06:39:15 -0800

I filed this as a bug at https://bugs.launchpad.net/nova/+bug/1413049. The
'patch' I posted there is not correct, so ignore that :)


What I'm finding is that about once or twice a day I run into a race
condition where _heal_instance_info_cache() is active and a new instance
is created at the same time. The heal ends up overwriting the info cache
with [], and this is never corrected, leaving an instance that is running
ok but is broken in the database.
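
To make the suspected ordering concrete, here is a toy model of one
interleaving that would leave [] in the cache. It is only a sketch and not
Nova code: every name in it (info_cache, heal_read, heal_write,
ports_backend) is made up for illustration, and it does not try to explain
why later heal runs fail to repair the entry.

```python
# Toy model of the suspected interleaving; this is NOT Nova code.
# All names (info_cache, heal_read, heal_write, ports_backend) are hypothetical.

info_cache = {}  # instance uuid -> cached network_info list


def heal_read(ports_backend, uuid):
    """Heal step 1: snapshot the ports currently known for the instance."""
    return list(ports_backend.get(uuid, []))


def heal_write(uuid, snapshot):
    """Heal step 2: overwrite the cache with the (possibly stale) snapshot."""
    info_cache[uuid] = snapshot


def run_interleaving():
    ports_backend = {}
    uuid = "instance-1"

    # The heal task looks at the instance before its port exists.
    stale = heal_read(ports_backend, uuid)

    # Instance creation finishes: the port is allocated and cached.
    ports_backend[uuid] = ["port-a"]
    info_cache[uuid] = list(ports_backend[uuid])

    # The heal task now writes its stale snapshot over the good entry.
    heal_write(uuid, stale)
    return info_cache[uuid]


result = run_interleaving()
print(result)  # the cache entry left behind after the race
```

In this model the instance itself is fine (the port exists in
ports_backend); only the cached view of it is wrong, which matches the
symptom described above.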

If you run

mysql -e "select instances.host, instances.hostname, instances.uuid,
    instances.user_id
  from instance_info_caches, instances
  where network_info = '[]'
    and instances.deleted = 0
    and instances.uuid = instance_info_caches.instance_uuid;" nova

it should return nothing. For me, it lists the broken instances.
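
For anyone who wants to see what the query selects without touching a real
nova database, here is a small sketch that runs the same select against an
in-memory SQLite database. The two tables are cut down to just the columns
the query names, and the rows are invented sample data, so treat it as an
illustration of the join and filter only.

```python
import sqlite3

# Same select as the mysql command, run against a toy schema.
QUERY = """
SELECT instances.host, instances.hostname, instances.uuid, instances.user_id
FROM instance_info_caches, instances
WHERE network_info = '[]'
  AND instances.deleted = 0
  AND instances.uuid = instance_info_caches.instance_uuid
"""


def find_broken(conn):
    """Return live instances whose cached network_info is the empty list."""
    return conn.execute(QUERY).fetchall()


conn = sqlite3.connect(":memory:")
# Assumption: only the columns referenced by the query are modelled here.
conn.executescript("""
CREATE TABLE instances (
    uuid TEXT, host TEXT, hostname TEXT, user_id TEXT, deleted INTEGER);
CREATE TABLE instance_info_caches (instance_uuid TEXT, network_info TEXT);
""")
conn.executemany(
    "INSERT INTO instances VALUES (?, ?, ?, ?, ?)",
    [
        ("u1", "compute1", "vm-ok", "alice", 0),
        ("u2", "compute1", "vm-broken", "bob", 0),
        ("u3", "compute2", "vm-deleted", "carol", 1),
    ],
)
conn.executemany(
    "INSERT INTO instance_info_caches VALUES (?, ?)",
    [
        ("u1", '[{"id": "port-a"}]'),  # healthy cache entry
        ("u2", "[]"),                  # live instance, empty cache
        ("u3", "[]"),                  # empty cache, but instance is deleted
    ],
)

broken = find_broken(conn)
print(broken)
```

Only the live instance with an empty cache entry is reported; the deleted
one is filtered out by instances.deleted = 0.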

And they are indeed broken; they often have multiple interfaces. If the
user does a 'rebuild', the libvirt XML file ends up with no source
bridges.

I have these set:
reclaim_instance_interval = 0
heal_instance_info_cache_interval = 20
periodic_interval = 10
image_cache_manager_interval = 10
running_deleted_instance_poll_interval = 10
instance_delete_interval = 10
running_deleted_instance_action = reap

Is no one else hitting this? This might be an unusual environment, since we
create instances quite dynamically (maybe 500-1000/day, all from Heat, so
a lot of them start all at once).

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
