On Tue, Jan 29, 2013 at 4:36 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
> On Mon, Jan 28, 2013 at 5:25 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
>> On Mon, Jan 28, 2013 at 5:21 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote:
>>>
>>> On 1/28/13 12:38 PM, "Marcus Sorensen" <shadow...@gmail.com> wrote:
>>>
>>>> On Mon, Jan 28, 2013 at 12:44 PM, Marcus Sorensen <shadow...@gmail.com> wrote:
>>>>> On Mon, Jan 28, 2013 at 12:31 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote:
>>>>>> Inline
>>>>>>
>>>>>> On 1/25/13 4:03 PM, "Brian Angus" <blan...@betterservers.com> wrote:
>>>>>>
>>>>>>> My answers inline...
>>>>>>>
>>>>>>> On 01/24/2013 06:49 PM, Marcus Sorensen wrote:
>>>>>>>> Brian is probably the right guy to answer some of these, but I'll
>>>>>>>> chime in on a few.
>>>>>>>>
>>>>>>>> On Thu, Jan 24, 2013 at 5:58 PM, Chiradeep Vittal <chiradeep.vit...@citrix.com> wrote:
>>>>>>>>
>>>>>>>>> Thanks for this.
>>>>>>>>> A few comments:
>>>>>>>>> 1. These mutating operations have to be async and hence the API
>>>>>>>>> commands have to inherit from BaseAsyncCmd instead of BaseCmd
>>>>>>>>>
>>>>>>>>
>>>>>>>> We can do that
>>>>>>>
>>>>>>> This change has been pushed to the branch
>>>>>>
>>>>>> I saw this. We have 2 choices: extend from BaseAsyncCreateCmd or
>>>>>> BaseAsyncCmd. I see that you have chosen the latter. If the former: you
>>>>>> synchronously create the nic object first and obtain an id (uuid), and
>>>>>> then asynchronously attach the nic. In the current implementation both
>>>>>> the db entity creation and the attach are in the asynchronous workflow.
>>>>>> If the attach fails in the backend, then there will be a nic object in
>>>>>> the database theoretically attached to the vm, but not really attached.
>>>>>> Choices:
>>>>>> A) change the data model to model the fact that the nic is attached
>>>>>> or not (and use the BaseAsyncCreateCmd approach)
>>>>>> B) ensure that the nic is destroyed when the attach fails.
>>>>>>
>>>>>> Note that with (B) you have a potential race condition where an API
>>>>>> client can retrieve the user vm before the attach has succeeded/failed
>>>>>> and then potentially add a static nat rule.
>>>>>
>>>>> This is an interesting issue. If it fails to attach, we don't
>>>>> necessarily want to back it out. In theory we only bother with the
>>>>> attach because the VM is running; if it weren't, it would be
>>>>> sufficient to just create the database entry. The existing code (e.g.
>>>>> deployVirtualMachine) doesn't delete your nics if they fail to apply
>>>>> when the VM starts up, correct?
>>>
>>> No, the VM will fail to start if any of the nics fail to attach/create.
>>> So you will never have a VM that claims to have 3 nics attached but in
>>> reality has only 2 attached.
>>>
>>>> We appreciate your feedback, Chiradeep, and your help in making sure we
>>>> account for all scenarios. Do you think we're far enough along on this
>>>> to get it into master? We are just finishing up a few tests that will
>>>> be added to the branch, but I hope we're to the point that any
>>>> lingering outliers can be bugfixes post code freeze.
>>>
>>> If you fix the failing nic case, sure.
>>>
>>
>> Ok, thanks.
>
> I've been reviewing this, and I'm still not sure that backing out the
> add nic db entry if the plug nic fails is the right way to go. We can
> catch the failure and remove it from the database, sure. But consider
> these scenarios:
>
> User runs deployVirtualMachine with parameter startvm=false. We don't
> destroy their chosen VM config later when they launch the VM and a nic
> or something else fails to provision.
>
> User runs addNicToVirtualMachine when the VM is off.
> This essentially
> edits the virtual machine's config to the same as if the virtual
> machine had been deployed with the extra network. Again, if the plug
> fails when they turn the VM on, we don't destroy their config.
>
> All three of these (deployVirtualMachine, addNicToVirtualMachine when
> the VM is on, and addNicToVirtualMachine when the VM is off) rely on
> _networkMgr.createNicForVm to validate and apply for a nic. If we
> trust that to create a valid config for our VMs, then we shouldn't go
> back on that if there's a temporary problem.
>
> The whole goal of this feature was to be able to edit the VM's network
> config post-deploy. Doing it while the VM is running is a bonus, but
> not strictly necessary. Would it be better to go back to only allowing
> it while the VM is off (essentially the same thing: we won't know if
> the plug works until they attempt to start the VM, at which point we
> wouldn't remove it)? Or simply scrap this branch and move on to making
> nics their own entity?
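To make the trade-off concrete, here is a minimal sketch of option (B) from up-thread: persist the nic record first, attempt the live attach only when the VM is running, and destroy the record if the attach fails. Everything here is a stand-in (the in-memory `nicTable`, `plugNicOnHost`, `addNicToVm`), not the real CloudStack code paths; it only illustrates the control flow being debated.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class AddNicSketch {
    // Minimal in-memory stand-in for the nic table.
    static final Map<String, String> nicTable = new HashMap<>();

    // Simulated backend plug; the flag stands in for hypervisor success/failure.
    static boolean plugNicOnHost(String nicId, boolean plugSucceeds) {
        return plugSucceeds;
    }

    static String addNicToVm(String vmId, boolean vmRunning, boolean plugSucceeds) {
        // Phase 1: persist the nic record (what a synchronous create() would do).
        String nicId = UUID.randomUUID().toString();
        nicTable.put(nicId, vmId);

        // If the VM is off, the DB entry is all we need; the plug happens
        // at the next VM start.
        if (!vmRunning) {
            return nicId;
        }

        // Phase 2: live attach. Under option (B), back the record out on failure
        // so no nic sits in the database "attached" to a VM it never reached.
        if (!plugNicOnHost(nicId, plugSucceeds)) {
            nicTable.remove(nicId);
            return null;
        }
        return nicId;
    }

    public static void main(String[] args) {
        String ok = addNicToVm("vm-1", true, true);      // running VM, plug works
        String failed = addNicToVm("vm-1", true, false); // running VM, plug fails
        String offline = addNicToVm("vm-2", false, false); // stopped VM, no plug needed
        System.out.println((ok != null) + " " + (failed == null) + " " + (offline != null));
        System.out.println("rows=" + nicTable.size());
    }
}
```

Note the asymmetry Marcus is pointing at: the stopped-VM path keeps its record unconditionally, so option (B) makes the running-VM path stricter than the equivalent config edit on a stopped VM.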
Oh, and I should mention that the code in question is also the existing code used to add nics to virtual routers. So I'm a little skittish about changing it to wipe out configs if the plug fails, as I don't know what effect that would have on the other code that calls it.
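For comparison, option (A) up-thread (change the data model to record whether the nic is actually attached) can be sketched the same way. The state names and helpers below are illustrative, not CloudStack's actual schema; the point is that a client reading the VM mid-attach sees an allocated-but-not-attached nic rather than a phantom one, which closes the static-NAT race without deleting anything on failure.

```java
import java.util.UUID;

public class NicStateSketch {
    enum NicState { ALLOCATED, ATTACHED, ATTACH_FAILED }

    static class Nic {
        final String id = UUID.randomUUID().toString();
        NicState state = NicState.ALLOCATED; // set by the synchronous create phase
    }

    // Synchronous phase: roughly what a BaseAsyncCreateCmd create() would do.
    static Nic create() {
        return new Nic();
    }

    // Asynchronous phase: roughly what execute() would do in the job thread.
    static void execute(Nic nic, boolean plugSucceeds) {
        nic.state = plugSucceeds ? NicState.ATTACHED : NicState.ATTACH_FAILED;
    }

    // Client-side guard: only act (e.g. add a static NAT rule) on attached nics.
    static boolean safeForStaticNat(Nic nic) {
        return nic.state == NicState.ATTACHED;
    }

    public static void main(String[] args) {
        Nic nic = create();
        System.out.println("before attach: " + safeForStaticNat(nic));
        execute(nic, true);
        System.out.println("after attach: " + safeForStaticNat(nic));
    }
}
```

A failed attach leaves the row in ATTACH_FAILED rather than deleting it, which sidesteps the concern about shared callers (like the virtual router path) having rows yanked out from under them.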