On Wed, Nov 01, 2017 at 02:07:48PM +0000, Ian Jackson wrote:
> So, investigations (mostly by Roger, and also a bit of archaeology in
> the osstest db by me) have determined:
> * This bug is 100% reproducible on affected hosts.  The repro is
>   to boot the Windows guest, save/restore it, then migrate it,
>   then shut down.  (This is from an IRL conversation with Roger and
>   may not be 100% accurate.  Roger, please correct me.)

Yes, that's correct AFAICT. The affected hosts work fine if Windows
is booted and then shut down (without save/restore or migrations).
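
For anyone who wants to try the repro by hand, the sequence above can be
sketched roughly as follows. This is only an illustration, not what osstest
actually runs: the guest name, config path and save-file path are made up,
a localhost migration is assumed, and the helper only builds the xl command
lines rather than executing them.

```python
import shlex


def repro_commands(domain, cfg_path, save_path, migrate_host="localhost"):
    """Build the xl command lines for: boot, save/restore, migrate, shut down.

    All arguments are hypothetical placeholders; nothing is executed here.
    """
    return [
        ["xl", "create", cfg_path],               # boot the Windows guest
        ["xl", "save", domain, save_path],        # save it to a file...
        ["xl", "restore", save_path],             # ...and restore it
        ["xl", "migrate", domain, migrate_host],  # migrate (assumed: localhost)
        ["xl", "shutdown", "-w", domain],         # shut down and wait
    ]


if __name__ == "__main__":
    # Hypothetical guest name and paths, for illustration only.
    for cmd in repro_commands("win-guest", "/etc/xen/win-guest.cfg",
                              "/var/tmp/win-guest.save"):
        print(shlex.join(cmd))
```

Dropping the save, restore and migrate entries from the list gives the plain
boot-and-shut-down case, which works fine on the affected hosts.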

> * Affected hosts differ from unaffected hosts according to cpuid.
>   Roger has repro'd the bug on an unaffected host by masking out
>   certain cpuid bits.  There are 6 implicated bits and he is working
>   to narrow that down.

I'm currently trying to narrow this down and make sure the above is
accurate.

> * It seems likely that this is therefore a real bug.  Maybe in Xen and
>   perhaps one that should indeed be a release blocker.
> * But this is not a regression between master and staging.  It affects
>   many osstest branches apparently equally.
> * This test is, effectively, new: before the osstest change
>   "HostDiskRoot: bump to 20G", these jobs would always fail earlier
>   and the affected step would not be run.
> * The passes we got on various osstest branches before were just
>   because those branches hadn't tested on an affected host yet.  As
>   branches test different hosts, they will stick on affected hosts.
> ISTM that this situation would therefore justify a force push.  We
> have established that this bug is very unlikely to be anything to do
> with the commits currently blocked by the failing pushes.

I agree, this is a bug that's always been present (at least in the
tested branches). It's triggered now because the Windows tests
have made further progress.

> Furthermore, the test is not intermittent, so a force push will be
> effective in the following sense: we would only get a "spurious" pass,
> resulting in the relevant osstest branch becoming stuck again, if a
> future test was unlucky and got an unaffected host.  That will happen
> infrequently enough.
> So unless anyone objects (and for xen.git#master, with Julien's
> permission), I intend to force push all affected osstest branches when
> the test report shows the only blockage is ws16 and/or win10 tests
> failing the "guest-stop" step.
> Opinions ?

I agree that a force push is justified. This bug is going to be quite
annoying if osstest decides to test on non-affected hosts, because
then we will get sporadic successful flights.

Thanks, Roger.
