I would prefer “A bigger design fix would be to make management server
asynchronous of agent side answer/response handling”. However, I understand
the volume of changes that would require.

I looked at the PR, and everything looks OK there. Still, I think we might
need some more time to review it and think through the possible outcomes of
such changes.

On Fri, May 11, 2018 at 7:55 AM, Rohit Yadav <rohit.ya...@shapeblue.com>
wrote:

> All,
>
>
> Historically, when the agent (kvm, ssvm, cpvm) is disconnected from the
> management server (say, due to a mgmt server restart), the reconnection
> logic waits for any pending tasks/commands to complete before reconnection
> attempts are made. I tried to search the git history but could not find a
> reason. Can anyone share why we may need this?
>
>
> Based on the reported issue:
>
> https://github.com/apache/cloudstack/issues/2633
>
>
> I've a working patch which removes this limitation:
>
> https://github.com/apache/cloudstack/pull/2638
>
>
> From testing with various combinations of tasks, I found that when the
> agent is disconnected, even if the pending task succeeds, it fails to send
> an Answer to the mgmt server; therefore, from the control plane's
> perspective, that task is still pending/on-going.
>
>
> When the mgmt server comes back online and the agent finally reconnects
> (depending on how long the pending task took), the executed operation is
> still pending in the mgmt server's view and may sometimes require manual
> cleanup in the database. By removing the limitation in the above PR, at
> least the agent reconnects faster, while the failure/fault behaviours
> remain the same. A bigger design fix would be to make management server
> asynchronous of agent side answer/response handling.
>
>
> - Rohit
>
> <https://cloudstack.apache.org>
>
>
>
> rohit.ya...@shapeblue.com
> www.shapeblue.com
> 53 Chandos Place, Covent Garden, London WC2N 4HS, UK
> @shapeblue
>
>
>
>


-- 
Rafael Weingärtner
