Re: job cancelled because of management server restart

Chip Childers Wed, 04 Sep 2013 06:36:34 -0700

On Tue, Sep 03, 2013 at 10:53:14PM +0000, Kelven Yang wrote:
> This is a design issue that we need to improve in general. However,
> simple rollback logic does not solve the problem, since abnormal
> termination can happen at any time, including in the middle of the
> job cancellation process itself.
> 
> Under the current architecture, the cleanup work is handled by the VM
> sync process. We allow jobs to be cancelled or to fail at any time. This
> design decision may cause temporary failures for operations that are in
> flight on the stopping or crashed management server; the VM sync process
> then self-heals and restores the consistency of the system data. This
> design choice is acceptable up to a point. Unfortunately, the process is
> buggy in current CloudStack releases. The example Marcus gave falls into
> the category of a bug in re-syncing a VM in the migrating state
> (basically, to fail it and allow the user to re-issue the command).
> 
> I've refactored the modeling used by the VM sync process but wasn't able
> to merge it into the main branch for the 4.2 release, due to concerns from
> the community that it was ready too late for architectural changes. I will
> retry the merge after the 4.2 release.


Now would be a good time to consider merging into master...
