Good to hear, Mike. We did a project where we tried to use XIO, but it was not working well. About time they started putting some effort into pushing out updates.
Remo

> On Apr 1, 2017, at 07:50, Mike Smith <mism...@overstock.com> wrote:
>
> Just circling back on this for posterity, in case it helps someone else with
> a similar issue:
>
> We found that this issue is a bug in the XIO Cinder driver and XIO management
> server code related to their Glance image caching implementation. Cinder
> volumes that were created as snapshots behind the scenes on the XIO via their
> image caching live-migrate OK, but any volumes that were 'originals' fail
> miserably to migrate.
>
> EMC is working through those issues and several more we found.
>
>
>> On Mar 10, 2017, at 12:23 AM, Mike Smith <mism...@overstock.com> wrote:
>>
>> Hello,
>>
>> We have a new Mitaka cloud in which we use Fibre Channel storage (via EMC
>> XtremIO) and Cinder. Provisioning/deleting of instances works fine, but at
>> the last stage of a live-migration operation the VM instance is left running
>> on the new host with no Cinder volume, due to failed paths. I'm not certain
>> how much of what I'm seeing is due to core Nova and Cinder functionality vs.
>> the specific Cinder driver for XtremIO, and I would love some insight on that
>> and on the possible causes of what we're seeing. We use this same combination
>> on our current Kilo-based cloud without issue.
>>
>> Here's what happens:
>>
>> - Create a VM booted from a volume. In this case, it ended up on
>>   openstack-compute04 and runs successfully.
>>
>> - Multipath status looks good:
>>
>> [root@openstack-compute04] # multipath -ll
>> 3514f0c5c0860003d dm-2 XtremIO ,XtremApp
>> size=20G features='0' hwhandler='0' wp=rw
>> `-+- policy='queue-length 0' prio=1 status=active
>>   |- 1:0:0:1  sdb 8:16 active ready running
>>   |- 1:0:1:1  sdc 8:32 active ready running
>>   |- 12:0:0:1 sdd 8:48 active ready running
>>   `- 12:0:1:1 sde 8:64 active ready running
>>
>> - The /lib/udev/scsi_id command (called by nova-rootwrap, as we'll see later)
>>   is able to determine SCSI IDs for these paths in /dev/disk/by-path:
>>
>> [root@openstack-compute04 by-path] # for i in `ls -1 | grep lun`; do echo $i; /lib/udev/scsi_id --page 0x83 --whitelisted /dev/disk/by-path/$i; echo; done
>> pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.0-fc-0x514f0c503187c704-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c701-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c705-lun-1
>> 3514f0c5c0860003d
>>
>> - Now perform live-migration. In this case, the instance moves to
>>   openstack-compute03:
>>
>> [root@openstack-controller01] # nova live-migration 13c82fa9-828c-4289-8bfc-e36e42f79388
>>
>> This fails. The VM is left 'running' on the new target host but has no disk,
>> because all the paths on the target host are failed. They are properly
>> removed from the original host.
>> [root@openstack-compute03] # virsh list --all
>>  Id    Name                           State
>> ----------------------------------------------------
>>  1     instance-000000b5              running
>>
>> - Failed paths are also confirmed by the multipath output:
>>
>> [root@openstack-compute03] # multipath -ll
>> 3514f0c5c0860003d dm-2 XtremIO ,XtremApp
>> size=20G features='0' hwhandler='0' wp=rw
>> `-+- policy='queue-length 0' prio=0 status=enabled
>>   |- 1:0:0:1  sdb 8:16 failed faulty running
>>   |- 1:0:1:1  sdc 8:32 failed faulty running
>>   |- 12:0:0:1 sdd 8:48 failed faulty running
>>   `- 12:0:1:1 sde 8:64 failed faulty running
>>
>> - The error in the nova-compute log of the target host (openstack-compute03
>>   in this case) points to the call made by nova-rootwrap, which receives a
>>   bad exit code when trying to get the scsi_id:
>>
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Command: sudo nova-rootwrap /etc/nova/rootwrap.conf scsi_id --page 0x83 --whitelisted /dev/disk/by-path/pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Exit code: 1
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Stdout: u''
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Stderr: u''
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher
>>
>> ...and in fact, running the scsi_id command directly (as was done previously
>> on the original host) fails to return a SCSI ID and exits with a
>> non-successful "1" error code:
>>
>> [root@openstack-compute03] # for i in `ls -1 | grep lun`; do echo $i; /lib/udev/scsi_id --page 0x83 --whitelisted /dev/disk/by-path/$i; echo $?; echo; done
>> pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 1
>>
>> pci-0000:03:00.0-fc-0x514f0c503187c704-lun-1
>> 1
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c701-lun-1
>> 1
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c705-lun-1
>> 1
>>
>> My assumption is that Nova expects those storage paths to be fully
>> functional at the time it tries to determine the SCSI IDs, and it can't
>> because the paths are faulty. I will be reaching out to EMC's support for
>> this, of course, but I would also like to get the group's thoughts on it. I
>> believe the XIO Cinder driver is responsible for making sure the storage
>> paths are properly presented, but I don't fully understand the relationship
>> between what Nova is doing and what the Cinder driver does.
>>
>> Any insight would be appreciated!
>>
>> Mike Smith
>> Lead Cloud Systems Architect
>> Overstock.com
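For anyone who runs into the same symptoms, here is a minimal diagnostic sketch for the migration target host. It only reuses the commands already shown in the quoted output plus standard Linux sysfs files; the /dev/disk/by-path glob and LUN numbering assume the same FC layout as in the thread, so adjust them per deployment:

  # 1. Overall multipath view -- 'failed faulty running' means the SCSI layer
  #    still sees the device but I/O to it is failing.
  multipath -ll

  # 2. FC link state for each HBA port (should be 'Online').
  for h in /sys/class/fc_host/host*; do
      echo "$h: $(cat $h/port_state)"
  done

  # 3. Per-path SCSI device state, plus the same scsi_id call Nova issues
  #    via rootwrap.
  for p in /dev/disk/by-path/pci-*-fc-*-lun-*; do
      dev=$(basename "$(readlink -f "$p")")
      echo "$p -> $dev state=$(cat /sys/block/$dev/device/state)"
      /lib/udev/scsi_id --page 0x83 --whitelisted "$p" || echo "scsi_id failed ($?)"
  done

If the fc_host ports are Online but scsi_id still exits 1 on every path, that matches what the thread describes: the LUN is mapped, but the individual paths are faulty.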
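It can also help to rule out rootwrap itself. The sketch below simply re-runs the exact command from the quoted nova-compute traceback, once through nova-rootwrap and once directly; identical exit code 1 with empty output from both suggests the device path, not the rootwrap filter, is at fault:

  # Path taken verbatim from the nova-compute error above.
  P=/dev/disk/by-path/pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1

  # As Nova runs it (via rootwrap) and as run by hand.
  sudo nova-rootwrap /etc/nova/rootwrap.conf scsi_id --page 0x83 --whitelisted "$P"; echo "rootwrap exit: $?"
  /lib/udev/scsi_id --page 0x83 --whitelisted "$P"; echo "direct exit: $?"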
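Finally, a rough check related to the root cause in the follow-up mail. This assumes the deployment uses Cinder's generic image volume cache (an assumption; the follow-up only says "their Glance image caching implementation", which may be XIO-specific), so treat the option names below as a sketch rather than confirmed XIO behaviour. Volumes born from a cache hit are created as clones/snapshots on the backend, which per the follow-up migrate fine, while "original" volumes do not:

  # See whether the image volume cache is configured at all.
  grep -nE 'image_volume_cache|cinder_internal_tenant' /etc/cinder/cinder.conf

  # Typical options when the cache is enabled (names from the standard Cinder
  # image volume cache feature; the [xtremio] section name is just an example):
  #   [DEFAULT]
  #   cinder_internal_tenant_project_id = <project uuid>
  #   cinder_internal_tenant_user_id    = <user uuid>
  #   [xtremio]
  #   image_volume_cache_enabled = True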