On 23-Aug-2012, at 1:34 AM, Alena Prokharchyk <alena.prokharc...@citrix.com> wrote:
> On 8/22/12 12:48 PM, "Rohit Yadav" <rohit.ya...@citrix.com> wrote:
>
>> I had some discussion on this issue, pasting the email; my comments
>> are inline:
>>
>> On 23-Aug-2012, at 12:46 AM, Alena Prokharchyk
>> <alena.prokharc...@citrix.com> wrote:
>>>
>>> Rohit,
>>>
>>> Your suggestion makes sense, just trying to add a more detailed
>>> description of the flow.
>>>
>>> Cleanup=true flow in restartNetwork:
>>> ==============================================
>>> 1) clean up network rules
>>> 2) shut down all network elements (destroy in the router case)
>>> 3) re-implement all the elements (restart/re-create)
>>> 4) reapply all the rules
>>>
>>> When cleanup=false, the first 2 steps are eliminated.
>>>
>>> We should continue disallowing cleanup=false for Shared networks.
>
> My bad, meant "cleanup=true" here.
>
>> Why? This may be useful when someone is upgrading CS. For example, they
>> have a shared network and the upgrade brings in a new systemvm template.
>> The Admin may be required to recreate VRs and SystemVMs with the new
>> systemvm template; in such a case recreating VRs for the shared network
>> would make sense.
>> To allow that, we give the Admin the option to clean up and restart the
>> network, and it would be up to the Admin whether they choose to clean up
>> and recreate their shared network.
>
> The only disadvantage I can think of at this point - a Shared network
> usually has a lot more VMs than an Isolated one, and cleaning up the
> resources has a bigger impact as the user VM range is wider.

Yes, I agree with what you're implying. If there are 1000 VRs and only one
was deleted and needs restarting, an Admin won't want to re-implement the
whole network and would want a way around such a major overhead.

We may slightly depart from the proposed solution (which was to stay in
sync with the advanced network behaviour; note: in an advanced network,
restartNetwork with cleanup=false would still re-implement the network for
running VRs) this way:

For cleanup=false:
- Check the health of running VRs and skip re-implementing the network
- Recreate deleted VRs, start stopped ones (for each Pod with running
  instances), re-implement the network elements and copy the network rules

For cleanup=true (the Admin intends to wipe out all VRs):
- Delete all VRs
- Create new VRs, implement the network and copy the rules
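Something like this, in rough Java - all type and method names below are
invented for illustration; this is a sketch of the intent, not the actual
CloudStack code:

    import java.util.List;

    // Hypothetical types, for illustration only.
    enum VrState { RUNNING, STOPPED, MISSING }

    interface Pod {
        boolean hasRunningUserVms();
        VrState routerState();
        boolean routerIsHealthy();      // e.g. check the VR's control channel
        void destroyRouter();
        void deployAndStartRouter();    // uses the current systemvm template
        void startRouter();
        void reapplyNetworkRules();
    }

    final class SharedNetworkRestart {

        // Sketch of the proposed restartNetwork flow for a Shared network
        // in a basic zone, where the VR is per-Pod.
        static void restart(List<Pod> pods, boolean cleanup) {
            for (Pod pod : pods) {
                if (!pod.hasRunningUserVms()) {
                    continue;   // don't waste resources on Pods with no user VMs
                }
                if (cleanup) {
                    // Admin intends to wipe out all VRs and start fresh.
                    if (pod.routerState() != VrState.MISSING) {
                        pod.destroyRouter();
                    }
                    pod.deployAndStartRouter();
                    pod.reapplyNetworkRules();
                    continue;
                }
                // cleanup=false: touch only the VRs that actually need it.
                switch (pod.routerState()) {
                    case RUNNING:
                        if (!pod.routerIsHealthy()) {
                            // one possible choice for an unhealthy VR:
                            // re-push the rules only
                            pod.reapplyNetworkRules();
                        }
                        break;  // healthy VR: skip re-implementing the network
                    case STOPPED:
                        pod.startRouter();
                        pod.reapplyNetworkRules();
                        break;
                    case MISSING:
                        pod.deployAndStartRouter();
                        pod.reapplyNetworkRules();
                        break;
                }
            }
        }
    }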
Regards,
Rohit

> If anybody else can think about any other reasons why we can't have
> cleanup=true for Shared networks, please comment.
>
> -Alena.
>
>>> What I think we can do is, when we implement the element in step 3),
>>> for Shared networks (btw, you can have those in an Advanced zone as
>>> well) do this:
>>>
>>> * Basic zone - get all the Pods having Starting/Running user VMs. If
>>> the virtual router is missing in this setup, re-create it. Do it for
>>> every Pod that has VMs and is missing the VR.
>>
>> Yes, this way CS will not waste resources creating VRs for Pods with no
>> hosts in a basic zone.
>>
>>> * Advanced zone - check if there are user VMs running in the network
>>> (not the Pod this time), and re-create the VR in the network if needed.
>>
>> +1.
>>
>> Regards,
>> Rohit
>>
>>> Let me know what you think, and please feel free to copy/paste the
>>> discussion to the apache mailing thread + the corresponding bug. And
>>> we'll proceed there.
>>
>> Thanks: http://bugs.cloudstack.org/browse/CS-16104
>>
>>> -Alena.
>>>
>>>> Note: CS should check if there are any hosts in all Pods and deploy
>>>> the VR only in those which have a host, as it's Pod-based.
>>>>
>>>> a. Cleanup selected: cleanup=true
>>>> - Running VRs: Stops and deletes them, recreates and starts new VRs.
>>>> - Stopped VRs: Deletes them, recreates and starts new VRs.
>>>> - Deleted VRs: Recreates and starts new VRs.
>>>>
>>>> b. Cleanup not selected: cleanup=false
>>>> - Running VRs: Checks/prepares, proceeds.
>>>> - Stopped VRs: Checks/prepares, starts them.
>>>> - Deleted VRs: Checks/prepares, starts new VRs.
>>>>
>>>> If this proposed logic makes sense, in a customer setup in a basic
>>>> zone the Admin has the option to set cleanup to false (unchecked in
>>>> the UI/API) and restart, which only recreates deleted VRs and starts
>>>> stopped VRs. This will be in sync with how restart works for an
>>>> advanced zone/network and how cleanup is used in that use case.
>>>>
>>>> I'm not sure if this will cause problems down the road, but it gives
>>>> the Admin more control over CS?
>>>>
>>>> Regards,
>>>> Rohit
>>
>>
>> On 22-Aug-2012, at 4:59 PM, Rohit Yadav <rohit.ya...@citrix.com> wrote:
>>
>>> Hi,
>>>
>>> The restart network behaviour is different for advanced and basic
>>> networks. I've opened this improvement ticket:
>>> http://bugs.cloudstack.org/browse/CS-16104
>>>
>>> In its description, I've mentioned the present behaviour and proposed
>>> an improvement for its behaviour on a basic network.
>>> The main agenda is to give the Admin a way to recreate (deleted) VRs.
>>> A workaround is to create a new instance, but it's not elegant.
>>>
>>> Please have a look at that and suggest/comment if it makes sense?
>>> Thanks.
>>>
>>> Regards,
>>> Rohit
>>>
>>>
>>> On 21-Aug-2012, at 10:27 PM, Chiradeep Vittal
>>> <chiradeep.vit...@citrix.com> wrote:
>>>>
>>>> On 8/20/12 10:26 PM, "Rohit Yadav" <rohit.ya...@citrix.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> On 21-Aug-2012, at 2:52 AM, Chiradeep Vittal
>>>>> <chiradeep.vit...@citrix.com> wrote:
>>>>>> On 8/20/12 2:34 AM, "Rohit Yadav" <rohit.ya...@citrix.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> A domr is checked for existence, and if it does not exist it is
>>>>>>> created whenever an instance is launched. If I stop, or stop and
>>>>>>> delete, the domr, HA won't recreate it. Is this intentional or a
>>>>>>> bug?
>>>>>>
>>>>>> Are you doing this from the API?
>>>>>
>>>>> No, trying to do this from the web app/client.
>>>>
>>>> The web app uses the API, so that is equivalent.
>>>>
>>>>>> If you are, then presumably you as the admin know what you are
>>>>>> doing and do not want the VR to automagically recreate?
>>>>>
>>>>> I agree, but don't we want to provide HA when the domr VM crashes?
>>>>> How about we introduce a force-stop (by admin) Boolean to know if
>>>>> this was intentional?
>>>>
>>>> HA does happen if the hypervisor reports that the VR has gone away
>>>> when CloudStack thinks that it should be running.
>>>>
>>>>> At least after deletion, when the Admin restarts the network,
>>>>> shouldn't the VRs be recreated?
>>>>>
>>>>>>> Further, when I restart the network after either stopping the
>>>>>>> domr, or stopping and deleting it, the restart fails. On the
>>>>>>> surface the problem is that null is passed instead of the correct
>>>>>>> Pod, but I'm not sure.
>>>>>>
>>>>>> Looks like a bug to me.
>>>>>
>>>>> +1 Filed that here: http://bugs.cloudstack.org/browse/CS-16104
>>>>>
>>>>> Regards,
>>>>> Rohit
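P.S. To make step 3) ("re-implement all the elements") concrete for Shared
networks, the Basic vs. Advanced zone check could look something like the
sketch below. Again, all names are invented for illustration; this is not
the actual CloudStack code:

    import java.util.List;

    // Hypothetical types, for illustration only.
    interface BasicZonePod {
        boolean hasStartingOrRunningUserVms();
        boolean hasVirtualRouter();
        void recreateVirtualRouter();
    }

    interface SharedNetwork {
        boolean isBasicZone();
        List<BasicZonePod> pods();
        boolean hasRunningUserVms();   // network-wide check, Advanced zone
        boolean hasVirtualRouter();
        void recreateVirtualRouter();
    }

    final class EnsureVirtualRouters {
        static void ensure(SharedNetwork network) {
            if (network.isBasicZone()) {
                // Basic zone: the VR is per-Pod, so walk every Pod that
                // has Starting/Running user VMs and is missing its VR.
                for (BasicZonePod pod : network.pods()) {
                    if (pod.hasStartingOrRunningUserVms()
                            && !pod.hasVirtualRouter()) {
                        pod.recreateVirtualRouter();
                    }
                }
            } else if (network.hasRunningUserVms()
                    && !network.hasVirtualRouter()) {
                // Advanced zone: the VR is per-network, not per-Pod.
                network.recreateVirtualRouter();
            }
        }
    }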