>>> On 12.06.17 at 16:30, <julien.gr...@arm.com> wrote:
> On 09/06/17 09:19, Jan Beulich wrote:
>>>>> On 07.06.17 at 10:12, <jbeul...@suse.com> wrote:
>>>>>> On 06.06.17 at 21:19, <sstabell...@kernel.org> wrote:
>>>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>>>>>>> On 06.06.17 at 16:00, <ian.jack...@eu.citrix.com> wrote:
>>>>>> Looking at the serial logs for that and comparing them with 10009,
>>>>>> it's not terribly easy to see what's going on because the kernel
>>>>>> versions are different and so produce different messages about xenbr0
>>>>>> (and I think may have a different bridge port management algorithm).
>>>>>>
>>>>>> But the messages about promiscuous mode seem the same, and of course
>>>>>> promiscuous mode is controlled by userspace, rather than by the kernel
>>>>>> (so should be the same in both).
>>>>>>
>>>>>> However, in the failed test we see extra messages about promis:
>>>>>>
>>>>>> Jun 5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>>>>> ...
>>>>>> Jun 5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>>>>
>>>>> Wouldn't those be another result of the guest shutting down /
>>>>> being shut down?
>>>>>
>>>>>> Also, the qemu log for the guest in the failure case says this:
>>>>>>
>>>>>> Log-dirty command enable
>>>>>> Log-dirty: no command yet.
>>>>>> reset requested in cpu_handle_ioreq.
>>>>>
>>>>> So this would seem to call for instrumentation on the qemu side
>>>>> then, as the only path via which this can be initiated is - afaics -
>>>>> qemu_system_reset_request(), which doesn't have very many
>>>>> callers that could possibly be of interest here. Adding Stefano ...
>>>>
>>>> I am pretty sure that those messages come from qemu traditional: "reset
>>>> requested in cpu_handle_ioreq" is not printed by qemu-xen.
>>>
>>> Oh, indeed - I didn't pay attention to this being a *-qemut-*
>>> test. I'm sorry.
>>>
>>>> In any case, the request comes from qemu_system_reset_request, which is
>>>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>>>> initiated the reset (or resume)?
>>>
>>> Right, this and hw/pckbd.c look to be the only possible
>>> sources. Yet then it's still unclear what makes the guest go
>>> down.
>>
>> So with all of the above in mind I wonder whether we shouldn't
>> revert 933f966bcd then - that debugging code is unlikely to help
>> with any further analysis of the issue, as reaching that code
>> for a dying domain is only a symptom as far as we understand it
>> now, not anywhere near the cause.
>
> Are you suggesting to revert on Xen 4.9?
Yes, if we revert now, then I'd say on both master and 4.9.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel