Maybe I should seek a little clarification as to how live migration works in CS with KVM.
Before we do a live migration of VM 1 from Host 1 to Host 2, do we detach all disks from VM1? If so, then we're good to go there. I'm not as clear with HA. If VM 1 goes down because Host 1 crashes, is the attach-volume command invoked as many times as need be (depending on how many volumes need to be attached) when VM 1 is restarted on Host 2? On Fri, Sep 27, 2013 at 12:17 AM, Mike Tutkowski < mike.tutkow...@solidfire.com> wrote: > Let me clarify this line a bit: > > "We get away without this with XenServer and VMware because - as far as I > know - CS delegates HA and live migration to those clusters and they handle > it most likely with some kind of locking protocol on the SR/datastore." > > When I set up a XenServer or a VMware cluster, all nodes in the cluster > have the proper CHAP credentials and can access a shared SR/datastore. > > HA and live migrations are OK here because the cluster controls access to > the VDI on the SR (or VMDK on the datastore) with some kind of locking > protocol, I expect. > > Since KVM isn't really in a cluster outside of the CloudStack world, > CloudStack has to handle these intricacies. > > In my case, I'm just presenting a raw disk to a VM on a KVM host. > > In that case, HA and live migration depend on the storage plug-in being > able to grant and revoke access to the volume for hosts as needed. > > I'd actually rather not even bother with CHAP when using KVM. > > > On Fri, Sep 27, 2013 at 12:06 AM, Mike Tutkowski < > mike.tutkow...@solidfire.com> wrote: > >> Hey Marcus, >> >> I agree that CHAP does not fulfill the same role as fencing. >> >> I think we're going to have trouble with HA and live migrations on KVM if >> the storage plug-in doesn't have a way of knowing when a host wants to >> access a volume and when we want to revoke access to that volume. >> >> We get away without this with XenServer and VMware because - as far as I >> know - CS delegates HA and live migration to those clusters and they handle >> it most likely with some kind of locking protocol on the SR/datastore. >> >> As far as the path field is concerned, I should be able to populate it >> with the IQN of the volume in question. >> >> One problem I do see, however, is in the getPhysicalDisk method. >> >> How are you envisioning I keep track of KVMPhysicalDisks that I create in >> my connect method? >> >> Initially I was thinking I'd just keep them in a map. Storage pool UUID >> to KVMPhysicalDisks. >> >> The problem is, how do I reconstruct that map if the agent is restarted >> (say the host crashes or is restarted). >> >> For storage pools, we get a message (ModifyStoragePoolCommand) from the >> CS MS to tell us about all of the relevant storage pools. With this >> message, I can reconstruct my cache of storage pools. No problem there. >> >> But how will I know which volumes belong to a given storage pool if I >> have to rebuild that map? How will I even know which volumes are in use at >> all? >> >> Thanks >> >> >> On Thu, Sep 26, 2013 at 11:37 PM, Marcus Sorensen <shadow...@gmail.com>wrote: >> >>> On Thu, Sep 26, 2013 at 10:23 PM, Mike Tutkowski >>> <mike.tutkow...@solidfire.com> wrote: >>> > My comments are inline: >>> > >>> > >>> > On Thu, Sep 26, 2013 at 9:10 PM, Marcus Sorensen <shadow...@gmail.com >>> >wrote: >>> > >>> >> Ok, let me digest this a bit. I got the github responses but I'd also >>> >> like to keep it on-list as well. 
>>> >> >>> >> My initial thoughts are: >>> >> >>> >> 1) I don't think disk format and size are necessary parameters for >>> >> connectPhysicalDisk, as the format can be determined by the adaptor, >>> >> and the size is set during the createAsync call in the plugin. We >>> >> really just need the disk path and the pool. >>> >> >>> > [Mike T.] I agree, format is not needed. The only reason I have size >>> passed >>> > in is because I need to create a KVMPhysicalDisk at the end of the >>> connect >>> > method. Since this KVMPhysicalDisk is (in the code on GitHub) being >>> used to >>> > create our XML to attach the disk, I figured we'd need that size. The >>> > KVMPhysicalDisk I produce from my implementation of getPhysicalDisk is >>> not >>> > as good because I don't know the size of the disk at this point (I >>> don't >>> > keep that information around). The reason I don't keep that info >>> around is >>> > because I don't have a good way to reproduce that info if the KVM host >>> is >>> > rebooted. We get info about storage pools in the form of a >>> > ModifyStoragePoolCommand, but nothing about the volumes inside of the >>> > storage pool. >>> >>> getPhysicalDisk is called in a bunch of places. I'd rely on the >>> connectPhysicalDisk to only make the block device appear on the host, >>> and then getPhysicalDisk to find that block device and fill out things >>> like disk size and path (the real path to the local block device) for >>> passing and creating the disk XML. Trust me, unless things have >>> changed significantly you need to be able to identify a given device >>> as a specific local disk by whatever you are setting the 'path' >>> attribute to be. getPhysicalDisk will be called on your storage pool >>> with simply the path attribute, and via your adaptor with the pool and >>> path. >>> >>> So you may set path as some combination of iqn and target/pool info, >>> or if iqn is enough to identify a unique block device (in >>> /dev/disk/by-id maybe?) on a host then just use that. Path just needs >>> to be something, anything, to identify the disk on the host. In >>> getPhysicalDisk, identify the local block device matching the info, >>> create a new KVMPhysicalDisk with the local path, size, etc, and >>> return it. >>> >>> > >>> >> >>> >> 2) I thought this access group thing you mention are the grantAccess >>> >> and revokeAccess calls in the storage plugin 2.0 design doc. Was that >>> >> not implemented? >>> >> >>> > [Mike T.] Yeah, as I mentioned in an e-mail way back, those methods >>> were >>> > never implemented in 4.2. I think you said you were going to get around >>> > them not being implemented by keeping certain logic that talks to the >>> SAN >>> > in the agent. I don't think we want any SolidFire-specific code in the >>> > agent, however, so I can't go that route. If those methods do not get >>> > implemented in 4.3, then I will need to use CHAP credentials for KVM >>> (just >>> > like I did with XenServer and VMware). >>> >>> I initially figured your StorageAdaptor implementation would be all >>> solidfire specific, just like the mgmt server side plugin is. If it >>> can be generic to all iscsi storage then that's great. I agree that >>> ideally the agent wouldn't be making API calls to your SAN. I don't >>> think it should be necessary given that you're not going to use the >>> ACL route. I'm not sure CHAP fills the same purpose of fencing. >>> >>> > >>> >> >>> >> I see you've added getters/setters for the attach cmd to pass the >>> >> iscsi info you need. 
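(Interjecting a quick sketch of how I read the path/getPhysicalDisk suggestion above - purely illustrative, none of this exists in the plug-in yet; the /dev/disk/by-path scan, the parameter name, and the KVMPhysicalDisk(path, name, pool) constructor usage are my assumptions:)

    // Sketch: resolve the volume's IQN (what I'd store in the path field) to the
    // local block device the iSCSI login created, then wrap it for the caller.
    public KVMPhysicalDisk getPhysicalDisk(String volumeIqn, KVMStoragePool pool) {
        java.io.File[] links = new java.io.File("/dev/disk/by-path").listFiles();

        if (links != null) {
            for (java.io.File link : links) {
                // open-iscsi typically names these ip-<target>:3260-iscsi-<iqn>-lun-0
                if (link.getName().contains(volumeIqn)) {
                    KVMPhysicalDisk disk = new KVMPhysicalDisk(link.getAbsolutePath(), volumeIqn, pool);
                    disk.setFormat(KVMPhysicalDisk.PhysicalDiskFormat.RAW);
                    // size could be filled in here, e.g. from "blockdev --getsize64 <device>"
                    return disk;
                }
            }
        }

        throw new CloudRuntimeException("No connected block device found for " + volumeIqn);
    }

That way getPhysicalDisk never has to talk to the SAN; it only reports what connectPhysicalDisk already made visible on the host.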
Would it perhaps be possible to send a details >>> >> Map<String, String> instead? Then any plugin implementer could attach >>> >> arbitrary data they need. So it might be >>> >> connectPhysicalDisk(StoragePoolType type, String poolUuid, String >>> >> volPath, Map<String, String> details)? I'll have to look and see >>> >> where those cmd. attributes are set, ideally it would be all the way >>> >> back in the plugin to avoid custom code for every adaptor that wants >>> >> to set details. >>> >> >>> > [Mike T.] If I'm not using the volumes.path field for anything, I can >>> stick >>> > the IQN in volumes.path (as well as leaving it in volumes.iscsi_name, >>> which >>> > is used elsewhere). That way we only have to ask for getPath(). >>> >>> Yeah, whatever it is that you'd need to find the right block device >>> should go in the path. If you look through LibvirtComputingResource >>> you'll see stuff like this sprinkled around: >>> >>> KVMPhysicalDisk volume = primaryPool.getPhysicalDisk(cmd >>> .getVolumePath()); >>> >>> >>> or: >>> String volid = cmd.getPath(); >>> KVMPhysicalDisk vol = pool.getPhysicalDisk(volid); >>> >>> or: >>> >>> KVMPhysicalDisk physicalDisk = >>> _storagePoolMgr.getPhysicalDisk( store.getPoolType(), >>> store.getUuid(), >>> data.getPath()); >>> >>> Maybe some of it is short-circuited by the new KVMStorageProcessor, >>> but I'd still implement a working one, and then attachVolume can call >>> getPhysicalDisk after connectPhysicalDisk, even on your adaptor/pool. >>> >>> > >>> >> >>> >> On Thu, Sep 26, 2013 at 7:35 PM, Mike Tutkowski >>> >> <mike.tutkow...@solidfire.com> wrote: >>> >> > Also, if we went the non-CHAP route, before attaching a volume to a >>> VM, >>> >> > we'd have to tell the plug-in to set up a volume access group. >>> >> > >>> >> > When a volume is detached from a VM, we'd have to tell the plug-in >>> to >>> >> > delete the volume access group. >>> >> > >>> >> > >>> >> > On Thu, Sep 26, 2013 at 7:32 PM, Mike Tutkowski < >>> >> > mike.tutkow...@solidfire.com> wrote: >>> >> > >>> >> >> I mention this is my comments on GitHub, as well, but CHAP info is >>> >> >> associated with an account - not a storage pool. >>> >> >> >>> >> >> Ideally we could do without CHAP info entirely if we had a >>> reliable way >>> >> to >>> >> >> tell the storage plug-in that a given host wants to access a given >>> >> volume. >>> >> >> In this case, my storage plug-in could create what we call a Volume >>> >> Access >>> >> >> Group on the SAN. It would essentially say, "The host with IQN <x> >>> can >>> >> >> access the volume with IQN <y> without using CHAP credentials." Of >>> >> course >>> >> >> we'd need a way to revoke this privilege in the event of a live >>> >> migration >>> >> >> of a VM. Right now, I do not believe such a facility is supported >>> with >>> >> the >>> >> >> storage plug-ins. >>> >> >> >>> >> >> >>> >> >> On Thu, Sep 26, 2013 at 4:56 PM, Marcus Sorensen < >>> shadow...@gmail.com >>> >> >wrote: >>> >> >> >>> >> >>> Looking at your code, is the chap info stored with the pool, so we >>> >> >>> could pass the pool to the adaptor? That would be more agnostic, >>> >> >>> anyone implementing a plugin could pull the specifics they need >>> for >>> >> >>> their stuff out of the pool on the adaptor side, rather than >>> creating >>> >> >>> custom signatures. 
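(To make the details-map idea above concrete, here is roughly what I am picturing - just a sketch, nothing below exists yet, and the parameter names are placeholders:)

    // StorageAdaptor would gain an overload whose details map carries whatever a
    // plugin needs (target IQN, host/port, CHAP credentials, ...):
    boolean connectPhysicalDisk(String volPath, KVMStoragePool pool, Map<String, String> details);

    // KVMStoragePoolManager would mirror the earlier proposal and just delegate:
    public boolean connectPhysicalDisk(StoragePoolType type, String poolUuid,
            String volPath, Map<String, String> details) {
        StorageAdaptor adaptor = getStorageAdaptor(type);
        KVMStoragePool pool = adaptor.getStoragePool(poolUuid);

        return adaptor.connectPhysicalDisk(volPath, pool, details);
    }

The map itself would be populated back in the management-server/plugin code, so the agent stays storage-agnostic and no adaptor needs a custom signature.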
>>> >> >>> >>> >> >>> Also, I think we may want to consider implementing >>> connect/disconnect >>> >> >>> as just dummy methods in LibvirtStorageAdaptor, so we don't have >>> to be >>> >> >>> picky about which adaptors/types in every single place we may >>> want to >>> >> >>> connect/disconnect (in 4.1 there were several, I'm not sure if >>> >> >>> everything goes through this in 4.2). We can just call >>> >> >>> adaptor.connectPhysicalDisk and the adaptor can decide if it >>> needs to >>> >> >>> do anything. >>> >> >>> >>> >> >>> Comments are attached to your commit, I just wanted to echo them >>> here >>> >> >>> on-list. >>> >> >>> >>> >> >>> On Thu, Sep 26, 2013 at 4:32 PM, Mike Tutkowski >>> >> >>> <mike.tutkow...@solidfire.com> wrote: >>> >> >>> > Oh, SnapshotTestWithFakeData is just modified because the code >>> wasn't >>> >> >>> > building until I corrected this. It has nothing really to do >>> with my >>> >> >>> real >>> >> >>> > changes. >>> >> >>> > >>> >> >>> > >>> >> >>> > On Thu, Sep 26, 2013 at 4:31 PM, Mike Tutkowski < >>> >> >>> > mike.tutkow...@solidfire.com> wrote: >>> >> >>> > >>> >> >>> >> Hey Marcus, >>> >> >>> >> >>> >> >>> >> I implemented your recommendations regarding adding connect and >>> >> >>> disconnect >>> >> >>> >> methods. It is not yet checked in (as you know, having trouble >>> with >>> >> my >>> >> >>> KVM >>> >> >>> >> environment), but it is on GitHub here: >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> >>> >>> >> >>> https://github.com/mike-tutkowski/incubator-cloudstack/commit/f2897c65689012e6157c0a0c2ed7e5355900c59a >>> >> >>> >> >>> >> >>> >> Please let me know if you have any more comments. >>> >> >>> >> >>> >> >>> >> Thanks! >>> >> >>> >> >>> >> >>> >> >>> >> >>> >> On Thu, Sep 26, 2013 at 4:05 PM, Marcus Sorensen < >>> >> shadow...@gmail.com >>> >> >>> >wrote: >>> >> >>> >> >>> >> >>> >>> Mike, everyone, >>> >> >>> >>> As I've mentioned on the board, I'm working on getting our >>> own >>> >> >>> >>> internal KVM storage plugin working on 4.2. In the interest of >>> >> making >>> >> >>> >>> it forward compatible, I just wanted to confirm what you were >>> doing >>> >> >>> >>> with the solidfire plugin as far as attaching your iscsi >>> LUNs. We >>> >> had >>> >> >>> >>> discussed a new connectPhysicalDisk method for the >>> StorageAdaptor >>> >> >>> >>> class, something perhaps like: >>> >> >>> >>> >>> >> >>> >>> public boolean connectPhysicalDisk(String volumeUuid, >>> >> KVMStoragePool >>> >> >>> >>> pool); >>> >> >>> >>> >>> >> >>> >>> then added to KVMStoragePoolManager: >>> >> >>> >>> >>> >> >>> >>> public boolean connectPhysicalDisk(StoragePoolType type, >>> String >>> >> >>> >>> poolUuid, String volPath) { >>> >> >>> >>> StorageAdaptor adaptor = getStorageAdaptor(type); >>> >> >>> >>> KVMStoragePool pool = >>> adaptor.getStoragePool(poolUuid); >>> >> >>> >>> return adaptor.connectPhysicalDisk(volPath, pool); >>> >> >>> >>> } >>> >> >>> >>> >>> >> >>> >>> Something similar to this for disconnect as well. Then in the >>> >> >>> >>> KVMStorageProcessor these can be called as needed for >>> >> attach/detach. >>> >> >>> >>> We can probably stub out one in LibvirtStorageAdaptor so >>> there's no >>> >> >>> >>> need to switch or if/else for pool types, for example in >>> >> >>> >>> KVMStorageProcessor.attachVolume. >>> >> >>> >>> >>> >> >>> >>> I have debated on whether or not it should just be rolled into >>> >> >>> >>> getPhysicalDisk, having it connect the disk if it's not >>> already >>> >> >>> >>> connected. 
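(For the stub idea, I am imagining nothing fancier than this in LibvirtStorageAdaptor - again just a sketch - so callers could invoke connect/disconnect unconditionally and only adaptors that actually need to do work would override it:)

    @Override
    public boolean connectPhysicalDisk(String volPath, KVMStoragePool pool) {
        return true;  // no-op: libvirt-managed pools (NFS, CLVM, etc.) need no per-disk connect
    }

    @Override
    public boolean disconnectPhysicalDisk(String volPath, KVMStoragePool pool) {
        return true;  // no-op for the same reason
    }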
>>> >> >>> >>> getPhysicalDisk is called a lot, and I'm not sure it always needs to connect the disk when it does. In past iterations getPhysicalDisk has simply spoken to our SAN api and returned the disk details, nothing more. So it seemed more flexible and granular to do the connectPhysicalDisk (we have one now in our 4.1 version).

--
*Mike Tutkowski*
*Senior CloudStack Developer, SolidFire Inc.*
e: mike.tutkow...@solidfire.com
o: 303.746.7302
Advancing the way the world uses the cloud<http://solidfire.com/solution/overview/?video=play>
*™*