>>> On 07.06.17 at 10:12, <jbeul...@suse.com> wrote:
>>>> On 06.06.17 at 21:19, <sstabell...@kernel.org> wrote:
>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>> >>> On 06.06.17 at 16:00, <ian.jack...@eu.citrix.com> wrote:
>>> > Looking at the serial logs for that and comparing them with 10009,
>>> > it's not terribly easy to see what's going on because the kernel
>>> > versions are different and so produce different messages about xenbr0
>>> > (and I think may have a different bridge port management algorithm).
>>> >
>>> > But the messages about promiscuous mode seem the same, and of course
>>> > promiscuous mode is controlled by userspace, rather than by the kernel
>>> > (so should be the same in both).
>>> >
>>> > However, in the failed test we see extra messages about promis:
>>> >
>>> > Jun 5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>> > ...
>>> > Jun 5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>>
>>> Wouldn't those be another result of the guest shutting down /
>>> being shut down?
>>>
>>> > Also, the qemu log for the guest in the failure case says this:
>>> >
>>> > Log-dirty command enable
>>> > Log-dirty: no command yet.
>>> > reset requested in cpu_handle_ioreq.
>>>
>>> So this would seem to call for instrumentation on the qemu side
>>> then, as the only path via which this can be initiated is - afaics -
>>> qemu_system_reset_request(), which doesn't have very many
>>> callers that could possibly be of interest here. Adding Stefano ...
>>
>> I am pretty sure that those messages come from qemu traditional: "reset
>> requested in cpu_handle_ioreq" is not printed by qemu-xen.
>
> Oh, indeed - I didn't pay attention to this being a *-qemut-*
> test. I'm sorry.
>
>> In any case, the request comes from qemu_system_reset_request, which is
>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>> initiated the reset (or resume)?
>
> Right, this and hw/pckbd.c look to be the only possible
> sources. Yet then it's still unclear what makes the guest go
> down.

So with all of the above in mind I wonder whether we shouldn't
revert 933f966bcd then - that debugging code is unlikely to help
with any further analysis of the issue, as reaching that code for
a dying domain is only a symptom as far as we understand it now,
not anywhere near the cause.

Jan