Good to hear, Mike. We did a project where we tried to use XIO, but it was not working well. About time they started putting some effort into pushing out updates.
Remo

> On Apr 1, 2017, at 07:50, Mike Smith <mism...@overstock.com> wrote:
>
> Just circling back on this for posterity, in case it helps someone else with
> a similar issue:
>
> We found that this issue is a bug in the XIO Cinder driver and XIO management
> server code related to their Glance image caching implementation. Cinder
> volumes that were created as snapshots behind the scenes on the XIO via their
> image caching live-migrate OK, but any volumes that were 'originals' fail
> miserably to migrate.
>
> EMC is working through those issues and several more we found.
>
>
>> On Mar 10, 2017, at 12:23 AM, Mike Smith <mism...@overstock.com> wrote:
>>
>> Hello,
>>
>> We have a new Mitaka cloud in which we use Fibre Channel storage (via EMC
>> XtremIO) and Cinder. Provisioning/deleting of instances works fine, but at
>> the last stage of a live-migration operation the VM instance is left running
>> on the new host with no Cinder volume, due to failed paths. I'm not certain
>> how much of what I'm seeing is due to core Nova and Cinder functionality vs.
>> the specific Cinder driver for XtremIO, and I would love some insight on that
>> and on the possible causes of what we're seeing. We use this same combination
>> on our current Kilo-based cloud without issue.
>>
>> Here's what happens:
>>
>> - Create a VM booted from a volume. In this case, it ended up on
>>   openstack-compute04 and runs successfully.
>>
>> - Multipath status looks good:
>>
>> [root@openstack-compute04] # multipath -ll
>> 3514f0c5c0860003d dm-2 XtremIO ,XtremApp
>> size=20G features='0' hwhandler='0' wp=rw
>> `-+- policy='queue-length 0' prio=1 status=active
>>   |- 1:0:0:1  sdb 8:16 active ready running
>>   |- 1:0:1:1  sdc 8:32 active ready running
>>   |- 12:0:0:1 sdd 8:48 active ready running
>>   `- 12:0:1:1 sde 8:64 active ready running
>>
>> - The /lib/udev/scsi_id command (called by nova-rootwrap, as we'll see later)
>>   is able to determine SCSI IDs for these paths in /dev/disk/by-path:
>>
>> [root@openstack-compute04 by-path] # for i in `ls -1 | grep lun`; do echo $i; /lib/udev/scsi_id --page 0x83 --whitelisted /dev/disk/by-path/$i; echo; done
>> pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.0-fc-0x514f0c503187c704-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c701-lun-1
>> 3514f0c5c0860003d
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c705-lun-1
>> 3514f0c5c0860003d
>>
>> - Now perform live-migration. In this case, the instance moves to
>>   openstack-compute03:
>>
>> [root@openstack-controller01] # nova live-migration 13c82fa9-828c-4289-8bfc-e36e42f79388
>>
>> This fails. The VM is left 'running' on the new target host but has no disk,
>> because all the paths on the target host are failed. They are properly
>> removed from the original host.
>> [root@openstack-compute03] # virsh list --all
>>  Id    Name                           State
>> ----------------------------------------------------
>>  1     instance-000000b5              running
>>
>> - Failed paths are also confirmed by the multipath output:
>>
>> [root@openstack-compute03] # multipath -ll
>> 3514f0c5c0860003d dm-2 XtremIO ,XtremApp
>> size=20G features='0' hwhandler='0' wp=rw
>> `-+- policy='queue-length 0' prio=0 status=enabled
>>   |- 1:0:0:1  sdb 8:16 failed faulty running
>>   |- 1:0:1:1  sdc 8:32 failed faulty running
>>   |- 12:0:0:1 sdd 8:48 failed faulty running
>>   `- 12:0:1:1 sde 8:64 failed faulty running
>>
>> - The error in the nova-compute log of the target host (openstack-compute03
>>   in this case) points to the call made by nova-rootwrap, which receives a
>>   bad exit code when trying to get the scsi_id:
>>
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Command: sudo nova-rootwrap /etc/nova/rootwrap.conf scsi_id --page 0x83 --whitelisted /dev/disk/by-path/pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Exit code: 1
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Stdout: u''
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher Stderr: u''
>> 2017-03-09 21:02:07.364 2279 ERROR oslo_messaging.rpc.dispatcher
>>
>> ...and in fact, running the scsi_id command directly (as was done previously
>> on the original host) fails to return a SCSI ID and exits with a
>> non-successful "1" error code:
>>
>> [root@openstack-compute03] # for i in `ls -1 | grep lun`; do echo $i; /lib/udev/scsi_id --page 0x83 --whitelisted /dev/disk/by-path/$i; echo $?; echo; done
>> pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1
>> 1
>>
>> pci-0000:03:00.0-fc-0x514f0c503187c704-lun-1
>> 1
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c701-lun-1
>> 1
>>
>> pci-0000:03:00.1-fc-0x514f0c503187c705-lun-1
>> 1
>>
>> My assumption is that Nova expects those storage paths to be fully
>> functional at the time it tries to determine the SCSI IDs, and it can't
>> because the paths are faulty. I will be reaching out to EMC's support for
>> this, of course, but I would also like to get the group's thoughts on it. I
>> believe the XIO Cinder driver is responsible for making sure the storage
>> paths are properly presented, but I don't fully understand the relationship
>> between what Nova is doing and what the Cinder driver does.
>>
>> Any insight would be appreciated!
>>
>> Mike Smith
>> Lead Cloud Systems Architect
>> Overstock.com
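For anyone who runs into the same symptoms, here is a minimal diagnostic sketch for the migration target host. It only reuses the commands already shown in the quoted output plus standard Linux sysfs files; the /dev/disk/by-path glob and LUN numbering assume the same FC layout as in the thread, so adjust them per deployment:

  # 1. Overall multipath view -- 'failed faulty running' means the SCSI layer
  #    still sees the device but I/O to it is failing.
  multipath -ll

  # 2. FC link state for each HBA port (should be 'Online').
  for h in /sys/class/fc_host/host*; do
      echo "$h: $(cat $h/port_state)"
  done

  # 3. Per-path SCSI device state, plus the same scsi_id call Nova issues
  #    via rootwrap.
  for p in /dev/disk/by-path/pci-*-fc-*-lun-*; do
      dev=$(basename "$(readlink -f "$p")")
      echo "$p -> $dev state=$(cat /sys/block/$dev/device/state)"
      /lib/udev/scsi_id --page 0x83 --whitelisted "$p" || echo "scsi_id failed ($?)"
  done

If the fc_host ports are Online but scsi_id still exits 1 on every path, that matches what the thread describes: the LUN is mapped, but the individual paths are faulty.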
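It can also help to rule out rootwrap itself. The sketch below simply re-runs the exact command from the quoted nova-compute traceback, once through nova-rootwrap and once directly; identical exit code 1 with empty output from both suggests the device path, not the rootwrap filter, is at fault:

  # Path taken verbatim from the nova-compute error above.
  P=/dev/disk/by-path/pci-0000:03:00.0-fc-0x514f0c503187c700-lun-1

  # As Nova runs it (via rootwrap) and as run by hand.
  sudo nova-rootwrap /etc/nova/rootwrap.conf scsi_id --page 0x83 --whitelisted "$P"; echo "rootwrap exit: $?"
  /lib/udev/scsi_id --page 0x83 --whitelisted "$P"; echo "direct exit: $?"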
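Finally, a rough check related to the root cause in the follow-up mail. This assumes the deployment uses Cinder's generic image volume cache (an assumption; the follow-up only says "their Glance image caching implementation", which may be XIO-specific), so treat the option names below as a sketch rather than confirmed XIO behaviour. Volumes born from a cache hit are created as clones/snapshots on the backend, which per the follow-up migrate fine, while "original" volumes do not:

  # See whether the image volume cache is configured at all.
  grep -nE 'image_volume_cache|cinder_internal_tenant' /etc/cinder/cinder.conf

  # Typical options when the cache is enabled (names from the standard Cinder
  # image volume cache feature; the [xtremio] section name is just an example):
  #   [DEFAULT]
  #   cinder_internal_tenant_project_id = <project uuid>
  #   cinder_internal_tenant_user_id    = <user uuid>
  #   [xtremio]
  #   image_volume_cache_enabled = True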