Reviewed:  https://review.openstack.org/185549
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ec9d5e375e208686d33b9259b039cc009bded42e
Submitter: Jenkins
Branch:    master

commit ec9d5e375e208686d33b9259b039cc009bded42e
Author: Ankit Agrawal <[email protected]>
Date:   Mon Aug 10 16:27:57 2015 +1000

    libvirt: Race condition leads to instance in error

    ImageCacheManager deletes base image while image backend is copying
    image to the instance path leading instance to go in the error state.

    Acquired lock before removing image from cache. If libvirt is copying
    image to the instance path, image cache manager won't be able to remove
    it until libvirt finishes copying image completely.

    Closes-Bug: 1256838
    Closes-Bug: 1470437
    Co-Authored-By: Michael Still <[email protected]>
    Depends-On: I337ce28e2fc516c91bec61ca3639ebff0029ad49
    Change-Id: I376cc951922c338669fdf3f83da83e0d3cea1532

** Changed in: nova
       Status: In Progress => Fix Released

--
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1470437

Title:
  ImageCacheManager raises Permission denied error on nova compute in
  race condition

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  ImageCacheManager raises Permission denied error on nova compute in
  race condition

  While creating an instance snapshot nova calls guest.launch method
  from libvirt driver which changes the base file permissions and
  updates base file user from openstack to libvirt-qemu (in case of
  qcow2 image backend). In race condition when ImageCacheManager is
  trying to update last access time of this base file and guest.launch
  is called by instance snapshot just before updating the access time,
  ImageCacheManager raises Permission denied error in nova compute for
  os.utime().

  Steps to reproduce:

  1. Configure image_cache_manager_interval=120 in nova.conf and use
     qcow2 image backend.
  2. Add a sleep for 60 sec in _handle_base_image method of
     libvirt.imagecache just before calling os.utime().
  3. Restart nova services.
  4. Create an instance using image.
     $ nova boot --image 5e1659aa-6d38-44e8-aaa3-4217337436c0 --flavor 1 instance-1
  5. Check that instance is in active state.
  6. Go to the n-cpu screen and check imagecache manager logs at the
     point it waits to execute sleep statement added in step #2.
  7. Send instance snapshot request when imagecache manager is waiting
     to execute sleep.
     $ nova image-create 19c7900b-73d5-4c2e-b129-5e2a6b13f396 instance-1-snap
  8. Instance snapshot request updates the base file owner to
     libvirt-qemu by calling guest.launch method from libvirt driver.
  9. Now when imagecache manager comes out from sleep and executes
     os.utime it raises the following Permission denied error in nova
     compute.

  2015-07-01 01:51:46.794 ERROR nova.openstack.common.periodic_task [req-a03fa45f-ffb9-48dd-8937-5b0414c6864b None None] Error during ComputeManager._run_image_cache_manager_pass
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/openstack/common/periodic_task.py", line 224, in run_periodic_tasks
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     task(self, context)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/compute/manager.py", line 6177, in _run_image_cache_manager_pass
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self.driver.manage_image_cache(context, filtered_instances)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6252, in manage_image_cache
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self.image_cache_manager.update(context, all_instances)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 668, in update
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self._age_and_verify_cached_images(context, all_instances, base_dir)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 598, in _age_and_verify_cached_images
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self._handle_base_image(img, base_file)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 570, in _handle_base_image
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     os.utime(base_file, None)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task OSError: [Errno 13] Permission denied: '/opt/stack/data/nova/instances/_base/8d2c340dcce68e48a75457b1e91457feed27aef5'
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task

  Expected result:
  guest.launch should not update the base file permissions and owner to
  libvirt-qemu. Base file owner should remain unchanged.

  Actual result:
  Libvirt is updating the base file owner which causes permission issues
  in nova.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1470437/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
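
Illustration of the locking idea in the commit message above: the cache
manager takes the same per-image lock as the copy, so it cannot remove the
base file while a copy is in progress. This is only a minimal sketch and
not Nova's code. The names image_lock, copy_to_instance, remove_from_cache
and demo are hypothetical, and an in-process threading.Lock stands in for
whatever inter-process lock a real compute node would need.

```python
import os
import shutil
import tempfile
import threading
import time
from collections import defaultdict

# Hypothetical per-image lock registry (assumption: single process only).
_locks = defaultdict(threading.Lock)
_locks_guard = threading.Lock()


def image_lock(base_file):
    """Return the lock that serializes all access to one base file."""
    with _locks_guard:
        return _locks[os.path.basename(base_file)]


def copy_to_instance(base_file, instance_path, delay=0.0, locked=None):
    """Copy the base image while holding its lock (stand-in for the backend)."""
    with image_lock(base_file):
        if locked is not None:
            locked.set()  # signal that the copy now owns the lock
        time.sleep(delay)  # simulate a slow copy
        shutil.copyfile(base_file, instance_path)


def remove_from_cache(base_file):
    """Remove the base image only once no copy holds the lock."""
    with image_lock(base_file):
        if os.path.exists(base_file):
            os.remove(base_file)


def demo():
    """Start a slow copy, then try to remove the base file concurrently."""
    workdir = tempfile.mkdtemp()
    base = os.path.join(workdir, "base.img")
    disk = os.path.join(workdir, "disk")
    with open(base, "wb") as f:
        f.write(b"image-bytes")

    locked = threading.Event()
    copier = threading.Thread(
        target=copy_to_instance, args=(base, disk, 0.2, locked))
    copier.start()
    locked.wait()            # the copy is now in progress
    remove_from_cache(base)  # blocks until the copy releases the lock
    copier.join()

    with open(disk, "rb") as f:
        copied = f.read()
    base_exists = os.path.exists(base)
    shutil.rmtree(workdir)
    return copied, base_exists
```

Calling demo() starts the copy first and then requests removal. Because
both paths share one lock, the removal waits, the instance disk receives
the complete image, and only afterwards is the base file deleted.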

