Hi Ilya, thanks for the feedback - but in the "real world", you need to understand that 60 min is a next-to-useless timeout for some jobs (if I understand this specific parameter correctly - is the job really cancelled, or is only the job monitoring cancelled?).

My value for "job.cancel.threshold.minutes" is 2880 minutes (2 days).
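For reference, a rough sketch of how that threshold can be checked and raised with CloudMonkey (the CLI syntax is illustrative and differs slightly between CloudMonkey versions; the same can be done through the listConfigurations/updateConfiguration API calls or the UI):

    # check the current value of the async-job cancellation threshold
    cloudmonkey list configurations name=job.cancel.threshold.minutes

    # raise it to 2 days (2880 minutes); a management server restart may be
    # needed for some global settings to take effect
    cloudmonkey update configuration name=job.cancel.threshold.minutes value=2880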
My value for the "job.cancel.threshold.minutes" is 2880 minutes (2 days?) I can tell you when you have CEPH/NFS (CEPH even "worse" case, since slower read durign qemu-img convert process...) of 500GB, then imagine snapshot job will take many hours. Should I mention 1TB volumes (yes, we had client's like that...) Than attaching 1TB volume, that was uploaded to ACS (lives originally on Secondary Storage, and takes time to be copied over to NFS/CEPH) will take up to few hours. Then migrating 1TB volume from NFS to CEPH, or CEPH to NFS, also takes time...etc. I'm just giving you feedback as "user", admin of the cloud, zero DEV skills here :) , just to make sure you make practical decisions (and I admit I might be wrong with my stuff, but just giving you feedback from our public cloud setup) Cheers! On 5 April 2018 at 15:16, Tutkowski, Mike <mike.tutkow...@netapp.com> wrote: > Wow, there’s been a lot of good details noted from several people on how > this process works today and how we’d like it to work in the near future. > > 1) Any chance this is already documented on the Wiki? > > 2) If not, any chance someone would be willing to do so (a flow diagram > would be particularly useful). > > > On Apr 5, 2018, at 3:37 AM, Marc-Aurèle Brothier <ma...@exoscale.ch> > wrote: > > > > Hi all, > > > > Good point ilya but as stated by Sergey there's more thing to consider > > before being able to do a proper shutdown. I augmented my script I gave > you > > originally and changed code in CS. What we're doing for our environment > is > > as follow: > > > > 1. the MGMT looks for a change in the file /etc/lb-agent which contains > > keywords for HAproxy[2] (ready, maint) so that HA-proxy can disable the > > mgmt on the keyword "maint" and the mgmt server stops a couple of > > threads[1] to stop processing async jobs in the queue > > 2. Looks for the async jobs and wait until there is none to ensure you > can > > send the reconnect commands (if jobs are running, a reconnect will result > > in a failed job since the result will never reach the management server - > > the agent waits for the current job to be done before reconnecting, and > > discard the result... rooms for improvement here!) > > 3. Issue a reconnectHost command to all the hosts connected to the mgmt > > server so that they reconnect to another one, otherwise the mgmt must be > up > > since it is used to forward commands to agents. > > 4. when all agents are reconnected, we can shutdown the management server > > and perform the maintenance. > > > > One issue remains for me, during the reconnect, the commands that are > > processed at the same time should be kept in a queue until the agents > have > > finished any current jobs and have reconnected. Today the little time > > window during which the reconnect happens can lead to failed jobs due to > > the agent not being connected at the right moment. > > > > I could push a PR for the change to stop some processing threads based on > > the content of a file. It's possible also to cancel the drain of the > > management by simply changing the content of the file back to "ready" > > again, instead of "maint" [2]. > > > > [1] AsyncJobMgr-Heartbeat, CapacityChecker, StatsCollector > > [2] HA proxy documentation on agent checker: https://cbonte.github.io/ > > haproxy-dconv/1.6/configuration.html#5.2-agent-check > > > > Regarding your issue on the port blocking, I think it's fair to consider > > that if you want to shutdown your server at some point, you have to stop > > serving (some) requests. 
> > Regarding your issue with the port blocking, I think it's fair to consider that if you want to shut down your server at some point, you have to stop serving (some) requests. Here the only way is to stop serving everything. If the API had a REST design, we could reject any POST/PUT/DELETE operations and allow GET ones. I don't know how hard it would be today to only allow listBaseCmd operations, to be more friendly to the users.
> >
> > Marco
> >
> > On Thu, Apr 5, 2018 at 2:22 AM, Sergey Levitskiy <serg...@hotmail.com> wrote:
> >
> >> Now without spellchecking :)
> >>
> >> This is not simple, e.g. for VMware. Each management server also acts as an agent proxy, so tasks against a particular ESX host will always be forwarded. The right answer would be to support a native "maintenance mode" for the management server. When entering such a mode, the management server should release all agents including the SSVM, block/redirect API calls and login requests, and finish all async jobs it originated.
> >>
> >> On Apr 4, 2018, at 5:15 PM, Sergey Levitskiy <serg...@hotmail.com> wrote:
> >>
> >> This is not simple, e.g. for VMware. Each management server also acts as an agent proxy, so tasks against a particular ESX host will always be forwarded. The right answer would be native support for a "maintenance mode" for the management server. When entering such a mode, the management server should release all agents including the SSVM, block/redirect API calls and login requests, and finish all async jobs it originated.
> >>
> >> Sent from my iPhone
> >>
> >> On Apr 4, 2018, at 3:31 PM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:
> >>
> >> Ilya, still regarding the management server that is being shut down: if other MSs - or maybe system VMs (I am not sure whether they are able to do such tasks) - can direct/redirect/send new jobs to this management server (the one being shut down), the process might never end, because new tasks are always being created for the management server that we want to shut down. Is this scenario possible?
> >>
> >> That is why I mentioned blocking port 8250 for the "graceful-shutdown".
> >>
> >> If this scenario is not possible, then everything is fine.
> >>
> >> On Wed, Apr 4, 2018 at 7:14 PM, ilya musayev <ilya.mailing.li...@gmail.com> wrote:
> >>
> >> I'm thinking of using the configuration "job.cancel.threshold.minutes" - it will be the longest one:
> >>
> >>     "category": "Advanced",
> >>     "description": "Time (in minutes) for async-jobs to be forcely cancelled if it has been in process for long",
> >>     "name": "job.cancel.threshold.minutes",
> >>     "value": "60"
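To judge whether a given threshold is realistic for a particular cloud, it can help to look at how long the currently running async jobs have actually been in flight. A rough sketch (the job_status/job_dispatcher/job_init_msid columns are the ones named in this thread; the created and job_cmd columns and the "cloud" database/user names are assumptions that may differ per setup and version):

    # age, in minutes, of every async job still running, per originating mgmt server
    mysql -u cloud -p cloud -e "
      SELECT job_init_msid, id, job_cmd, created,
             TIMESTAMPDIFF(MINUTE, created, NOW()) AS age_minutes
      FROM async_job
      WHERE job_status = 0
        AND job_dispatcher NOT LIKE 'pseudoJobDispatcher'
      ORDER BY age_minutes DESC;"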
> >> On Wed, Apr 4, 2018 at 1:36 PM, Rafael Weingärtner <rafaelweingart...@gmail.com> wrote:
> >>
> >> Big +1 for this feature; I only have a few doubts.
> >>
> >> * Regarding the tasks/jobs that management servers (MSs) execute: do these tasks originate from requests that come to the MS, or is it possible for a request received by one management server to be executed by another? I mean, if I execute a request against MS1, will this request always be executed/treated by MS1, or is it possible that it is executed by another MS (e.g. MS2)?
> >>
> >> * I would suggest that after we block traffic coming to 8080/8443/8250 (we will need to block that one as well, right?), we log the execution of tasks. I mean, something saying: there are XXX tasks (enumerate the tasks) still being executed; we will wait for them to finish before shutting down.
> >>
> >> * The timeout (60 minutes suggested) could be a global setting that we load before executing the graceful shutdown.
> >>
> >> On Wed, Apr 4, 2018 at 5:15 PM, ilya musayev <ilya.mailing.li...@gmail.com> wrote:
> >>
> >> Use case: in any environment, from time to time an administrator needs to perform maintenance. The current stop sequence of the CloudStack management server ignores the fact that there may be long-running async jobs and terminates the process. This in turn can create a poor user experience and occasional inconsistency in the CloudStack DB.
> >>
> >> This is especially painful in large environments where the user has thousands of nodes and continuous patching happens around the clock, requiring migration of workloads from one node to another.
> >>
> >> With that said, I've created a script that monitors the async job queue for a given MS and waits for it to complete all jobs. More details are posted below.
> >>
> >> I'd like to introduce "graceful-shutdown" into the systemctl/service unit of the cloudstack-management service.
> >>
> >> The details of how it will work are below.
> >>
> >> Workflow for graceful shutdown:
> >> Using iptables/firewalld, block any connection attempts on 8080/8443 (we can identify the ports dynamically).
> >> Identify the MSID for the node; using the proper msid, query the async_job table for
> >> 1) any jobs that are still running (job_status = "0"),
> >> 2) job_dispatcher not like "pseudoJobDispatcher",
> >> 3) job_init_msid = $my_ms_id.
> >> Monitor this async_job table for 60 minutes, until all async jobs for the MSID are done, then proceed with the shutdown.
> >> If it fails for any reason or is terminated, catch the exit via the trap command and unblock 8080/8443.
> >>
> >> Comments are welcome.
> >>
> >> Regards,
> >> ilya
> >>
> >> --
> >> Rafael Weingärtner

--
Andrija Panić
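For completeness, a rough shell sketch of the graceful-shutdown workflow Ilya describes above - not his actual script. It assumes local MySQL access to the "cloud" database (password taken from a CLOUD_DB_PASS environment variable), that the node's management-server ID (msid) is passed as the first argument, and that 8080/8443 are the ports to block; all of these would need adjusting per environment:

    #!/bin/bash
    # Sketch: drain async jobs for this management server, then stop the service.
    MSID="$1"                       # this node's management server id (msid)
    TIMEOUT_MIN="${2:-60}"          # how long to wait for async jobs to finish

    block()   { iptables -I INPUT -p tcp -m multiport --dports 8080,8443 -j REJECT; }
    unblock() { iptables -D INPUT -p tcp -m multiport --dports 8080,8443 -j REJECT; }
    trap unblock EXIT               # always re-open the ports on exit or termination

    block
    for ((i = 0; i < TIMEOUT_MIN; i++)); do
        PENDING=$(mysql -N -u cloud -p"$CLOUD_DB_PASS" cloud -e \
            "SELECT COUNT(*) FROM async_job
             WHERE job_status = 0
               AND job_dispatcher NOT LIKE 'pseudoJobDispatcher'
               AND job_init_msid = $MSID;")
        [ "$PENDING" -eq 0 ] && break
        echo "$PENDING async job(s) still running for msid $MSID, waiting..."
        sleep 60
    done

    if [ "$PENDING" -eq 0 ]; then
        systemctl stop cloudstack-management
    else
        echo "Timed out after $TIMEOUT_MIN minutes with $PENDING job(s) still running." >&2
        exit 1
    fi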