Thanks for the update, Rene.
Unfortunately, I do not have a VMware setup to test with.
I will do a code review tomorrow and run some tests on Xen to make sure it's
not breaking anything else.


On Sat, Sep 26, 2015 at 17:39, Rene Moser <
m...@renemoser.net> wrote:

I discovered the race condition bug related to CLOUDSTACK-8848 while
testing in our lab, and Daan started a PR
https://github.com/apache/cloudstack/pull/829 for discussion.

But it turned out to be a dead-end discussion. Daan and I started a debug
session on Friday a week ago and discovered the real problem, but it was
unclear how it could be solved. Daan was away from the next day on.

After another discussion with @anshul1886, which started at
https://github.com/apache/cloudstack/pull/829#issuecomment-141613687, he
led me to the solution I created in
https://github.com/apache/cloudstack/pull/885.

The related comment from Anshul:

>From code it seems to be getting updated and DB also suggests that.
>It will not be updated if there is no power change for
>MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT. But that is to reduce DB
>transactions and will not create issues as it is updated if there is
>change in power state.

This means the whole calculation of how to handle a missing power state is
based on a DB record that is outdated because of the DB transaction
optimization.

My change makes sure that if we detect an outdated record, we reset the
counter so that we get new state updates.

In the worst case (if the VM is really missing), the handling of missing
state updates is postponed to the next missingStateReport. So to me, this
is really a safe way to fix this issue.
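
To illustrate the mechanism described above, here is a minimal sketch in
Java. It is not the actual CloudStack code: the class, the field and method
names, the grace period, and the value of the constant are all hypothetical.
Only the constant name MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT comes from the
quoted comment. The sketch assumes that repeated reports of the same state
stop being written after that many updates, so the record's timestamp can be
stale even for a healthy VM, and that a missing report resets the counter and
postpones the decision when it finds such a record.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; NOT the actual CloudStack implementation.
class PowerStateSketch {
    // Name taken from the quoted comment; the value 3 is an assumption.
    static final int MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT = 3;
    // Hypothetical grace period before a silent VM is treated as missing.
    static final long GRACE_PERIOD_MS = 60_000L;

    // Hypothetical stand-in for the per-VM power state DB record.
    static final class PowerRecord {
        String powerState;
        int sameStateCount;
        long lastUpdateMs;
    }

    private final Map<Long, PowerRecord> db = new HashMap<>();

    /** Handles a power state report; returns true if the record was written. */
    boolean onPowerReport(long vmId, String state, long nowMs) {
        PowerRecord r = db.computeIfAbsent(vmId, id -> new PowerRecord());
        if (!state.equals(r.powerState)) {
            // A state change is always written.
            r.powerState = state;
            r.sameStateCount = 1;
            r.lastUpdateMs = nowMs;
            return true;
        }
        if (r.sameStateCount < MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT) {
            r.sameStateCount++;
            r.lastUpdateMs = nowMs;
            return true;
        }
        // Write skipped to save DB transactions: lastUpdateMs goes stale.
        return false;
    }

    /** Handles a missing report; returns true if the VM should be treated as missing now. */
    boolean onMissingReport(long vmId, long nowMs) {
        PowerRecord r = db.get(vmId);
        if (r == null) {
            return false;
        }
        if (nowMs - r.lastUpdateMs <= GRACE_PERIOD_MS) {
            return false; // recently updated, still within the grace period
        }
        if (r.sameStateCount >= MAX_CONSECUTIVE_SAME_STATE_UPDATE_COUNT) {
            // The record may be old only because writes were skipped.
            // Reset the counter so the next report is written again, and
            // postpone the decision to the next missing report.
            r.sameStateCount = 0;
            return false;
        }
        return true;
    }
}
```

In this sketch a healthy VM gets a fresh record on its next power report
after the reset, while a VM that is really gone is handled one missing report
later.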

I patched our lab environment, where we discovered the race condition in
the first place, and we have not seen the bug happen again.

You can find the logs attached to the PR:
https://github.com/apache/cloudstack/pull/885

It isn't easy to test; I had to learn when to start a VR migration to hit
the race condition. That is why I am writing this message: to show you that
I tested it under real-world conditions.

Yours
resmo



-- 
-
Sent from Windows Phone
~Rajani
