Re: [Xen-devel] [xen-unstable test] 110009: regressions - FAIL

Jan Beulich Mon, 12 Jun 2017 07:58:22 -0700

>>> On 12.06.17 at 16:30, <julien.gr...@arm.com> wrote:
> On 09/06/17 09:19, Jan Beulich wrote:
>>>>> On 07.06.17 at 10:12, <jbeul...@suse.com> wrote:
>>>>>> On 06.06.17 at 21:19, <sstabell...@kernel.org> wrote:
>>>> On Tue, 6 Jun 2017, Jan Beulich wrote:
>>>>>>>> On 06.06.17 at 16:00, <ian.jack...@eu.citrix.com> wrote:
>>>>>> Looking at the serial logs for that and comparing them with 10009,
>>>>>> it's not terribly easy to see what's going on because the kernel
>>>>>> versions are different and so produce different messages about xenbr0
>>>>>> (and I think may have a different bridge port management algorithm).
>>>>>>
>>>>>> But the messages about promiscuous mode seem the same, and of course
>>>>>> promiscuous mode is controlled by userspace, rather than by the kernel
>>>>>> (so should be the same in both).
>>>>>>
>>>>>> However, in the failed test we see extra messages about promis:
>>>>>>
>>>>>>   Jun  5 13:37:08.353656 [ 2191.652079] device vif7.0-emu left promiscuous mode
>>>>>>   ...
>>>>>>   Jun  5 13:37:08.377571 [ 2191.675298] device vif7.0 left promiscuous mode
>>>>>
>>>>> Wouldn't those be another result of the guest shutting down /
>>>>> being shut down?
>>>>>
>>>>>> Also, the qemu log for the guest in the failure case says this:
>>>>>>
>>>>>>   Log-dirty command enable
>>>>>>   Log-dirty: no command yet.
>>>>>>   reset requested in cpu_handle_ioreq.
>>>>>
>>>>> So this would seem to call for instrumentation on the qemu side
>>>>> then, as the only path via which this can be initiated is - afaics -
>>>>> qemu_system_reset_request(), which doesn't have very many
>>>>> callers that could possibly be of interest here. Adding Stefano ...
>>>>
>>>> I am pretty sure that those messages come from qemu traditional: "reset
>>>> requested in cpu_handle_ioreq" is not printed by qemu-xen.
>>>
>>> Oh, indeed - I didn't pay attention to this being a *-qemut-*
>>> test. I'm sorry.
>>>
>>>> In any case, the request comes from qemu_system_reset_request, which is
>>>> called by hw/acpi.c:pm_ioport_writew. It looks like the guest OS
>>>> initiated the reset (or resume)?
>>>
>>> Right, this and hw/pckbd.c look to be the only possible
>>> sources. Yet then it's still unclear what makes the guest go
>>> down.
>>
>> So with all of the above in mind I wonder whether we shouldn't
>> revert 933f966bcd then - that debugging code is unlikely to help
>> with any further analysis of the issue, as reaching that code
>> for a dying domain is only a symptom as far as we understand it
>> now, not anywhere near the cause.
> 
> Are you suggesting to revert on Xen 4.9?


Yes, if we revert now, then I'd say on both master and 4.9.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
