Reviewed:  https://review.openstack.org/185549
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ec9d5e375e208686d33b9259b039cc009bded42e
Submitter: Jenkins
Branch:    master

commit ec9d5e375e208686d33b9259b039cc009bded42e
Author: Ankit Agrawal <[email protected]>
Date:   Mon Aug 10 16:27:57 2015 +1000

    libvirt: Race condition leads to instance in error
    
    ImageCacheManager deletes the base image while the image backend is
    copying it to the instance path, which puts the instance into the
    error state.
    
    Acquire a lock before removing an image from the cache. If libvirt is
    copying the image to the instance path, the image cache manager cannot
    remove it until libvirt has finished copying the image completely.
    
    Closes-Bug: 1256838
    Closes-Bug: 1470437
    Co-Authored-By: Michael Still <[email protected]>
    Depends-On: I337ce28e2fc516c91bec61ca3639ebff0029ad49
    Change-Id: I376cc951922c338669fdf3f83da83e0d3cea1532


** Changed in: nova
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1470437

Title:
  ImageCacheManager raises Permission denied error on nova compute in
  race condition

Status in OpenStack Compute (nova):
  Fix Released

Bug description:
  ImageCacheManager raises Permission denied error on nova compute in
  race condition

  While creating an instance snapshot, nova calls the guest.launch
  method from the libvirt driver, which changes the base file
  permissions and changes the base file owner from openstack to
  libvirt-qemu (with the qcow2 image backend). If the instance snapshot
  calls guest.launch just before ImageCacheManager updates the last
  access time of this base file, ImageCacheManager raises a Permission
  denied error in nova compute from os.utime().

  Steps to reproduce:
  1. Configure image_cache_manager_interval=120 in nova.conf and use the qcow2 image backend.
  2. Add a 60-second sleep in the _handle_base_image method of libvirt.imagecache, just before the call to os.utime().
  3. Restart nova services.
  4. Create an instance using an image.
  $ nova boot --image 5e1659aa-6d38-44e8-aaa3-4217337436c0 --flavor 1 instance-1
  5. Check that the instance is in the active state.
  6. Go to the n-cpu screen and watch the imagecache manager logs until it reaches the sleep statement added in step #2.
  7. Send an instance snapshot request while the imagecache manager is waiting in that sleep.
  $ nova image-create 19c7900b-73d5-4c2e-b129-5e2a6b13f396 instance-1-snap
  8. The instance snapshot request changes the base file owner to libvirt-qemu by calling the guest.launch method from the libvirt driver.
  9. When the imagecache manager comes out of the sleep and executes os.utime(), it raises the following Permission denied error in nova compute.

  2015-07-01 01:51:46.794 ERROR nova.openstack.common.periodic_task [req-a03fa45f-ffb9-48dd-8937-5b0414c6864b None None] Error during ComputeManager._run_image_cache_manager_pass
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task Traceback (most recent call last):
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/openstack/common/periodic_task.py", line 224, in run_periodic_tasks
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     task(self, context)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/compute/manager.py", line 6177, in _run_image_cache_manager_pass
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self.driver.manage_image_cache(context, filtered_instances)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6252, in manage_image_cache
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self.image_cache_manager.update(context, all_instances)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 668, in update
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self._age_and_verify_cached_images(context, all_instances, base_dir)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 598, in _age_and_verify_cached_images
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     self._handle_base_image(img, base_file)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task   File "/opt/stack/nova/nova/virt/libvirt/imagecache.py", line 570, in _handle_base_image
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task     os.utime(base_file, None)
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task OSError: [Errno 13] Permission denied: '/opt/stack/data/nova/instances/_base/8d2c340dcce68e48a75457b1e91457feed27aef5'
  2015-07-01 01:51:46.794 TRACE nova.openstack.common.periodic_task
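
  The race window from the steps above can be illustrated with a small stand-in. This is only a sketch under stated assumptions: handle_base_image is a hypothetical simplification of _handle_base_image, the delay parameter plays the role of the sleep added in step 2, and the injectable utime argument exists only so the failure can be simulated without changing file ownership.

```python
import errno
import os
import time


def handle_base_image(base_file, delay=0.0, utime=os.utime):
    """Hypothetical stand-in: wait, then refresh the file's access time.

    The delay widens the window in which another actor (the snapshot path,
    in the report) can change the owner of base_file. If that happens, the
    utime call fails with errno 13 and the error propagates to the caller.
    """
    time.sleep(delay)
    utime(base_file, None)


def utime_after_owner_change(path, times):
    """Simulate os.utime() on a file that now belongs to another user."""
    raise PermissionError(errno.EACCES, "Permission denied", path)
```

  Calling handle_base_image(path, utime=utime_after_owner_change) raises an OSError with errno 13, which corresponds to the last frame of the traceback.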

  Expected result: guest.launch should not update the base file
  permissions and owner to libvirt-qemu.  Base file owner should remain
  unchanged.

  Actual result: Libvirt is updating the base file owner which causes
  permission issues in nova.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1470437/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
