Re: handle_pio looping during domain shutdown, with qemu 4.2.0 in stubdom

Andrew Cooper Thu, 04 Jun 2020 04:14:25 -0700

On 04/06/2020 08:08, Jan Beulich wrote:
> On 04.06.2020 03:46, Marek Marczykowski-Górecki wrote:
>> Then, we get the main issue:
>>
>>     (XEN) d3v0 handle_pio port 0xb004 read 0x0000
>>     (XEN) d3v0 Weird PIO status 1, port 0xb004 read 0xffff
>>     (XEN) domain_crash called from io.c:178
>>
>> Note, there was no XEN_DOMCTL_destroydomain for domain 3 nor its stubdom
>> yet. But XEN_DMOP_remote_shutdown for domain 3 was called already.
> I'd guess an issue with the shutdown deferral logic. Did you / can
> you check whether XEN_DMOP_remote_shutdown managed to pause all
> CPUs (I assume it didn't, since once they're paused there shouldn't
> be any I/O there anymore, and hence no I/O emulation)?


The vcpu in question is talking to Qemu, so will have v->defer_shutdown
intermittently set, and skip the pause in domain_shutdown().

I presume this lack of pause is to allow the vcpu in question to still
be scheduled to consume the IOREQ reply?  (It's fairly opaque logic with
zero clarifying details.)

What *should* happen is that, after consuming the reply, the vcpu should
notice and pause itself, at which point it would yield to the
scheduler.  This is the purpose of vcpu_{start,end}_shutdown_deferral().
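
As a rough illustration of the intended handshake, here is a standalone toy
model.  It is not the Xen source: the toy_* names and the simplified structs
are invented for this sketch, and locking, memory barriers and the real pause
machinery are all left out.  It only shows the ordering I'd expect: a vcpu
with I/O in flight is skipped by the shutdown loop, and then pauses itself
once it ends its deferral.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_VCPUS 4

struct toy_domain;

/* Hypothetical stand-in for a vcpu; only the fields discussed here. */
struct toy_vcpu {
    struct toy_domain *domain;
    bool defer_shutdown;      /* set while an I/O request is in flight */
    bool paused_for_shutdown; /* set once the vcpu is paused for shutdown */
};

/* Hypothetical stand-in for a domain. */
struct toy_domain {
    bool is_shutting_down;
    size_t nr_vcpus;
    struct toy_vcpu vcpu[TOY_MAX_VCPUS];
};

static void toy_domain_init(struct toy_domain *d, size_t nr_vcpus)
{
    assert(nr_vcpus <= TOY_MAX_VCPUS);
    d->is_shutting_down = false;
    d->nr_vcpus = nr_vcpus;
    for (size_t i = 0; i < nr_vcpus; i++) {
        d->vcpu[i].domain = d;
        d->vcpu[i].defer_shutdown = false;
        d->vcpu[i].paused_for_shutdown = false;
    }
}

/* If a shutdown is pending, pause this vcpu and drop any deferral. */
static void toy_vcpu_check_shutdown(struct toy_vcpu *v)
{
    if (v->domain->is_shutting_down) {
        v->paused_for_shutdown = true;
        v->defer_shutdown = false;
    }
}

/* Called before issuing I/O; returns false if shutdown already started. */
static bool toy_vcpu_start_shutdown_deferral(struct toy_vcpu *v)
{
    if (v->defer_shutdown)
        return true;
    v->defer_shutdown = true;
    toy_vcpu_check_shutdown(v);
    return v->defer_shutdown;
}

/* Called after consuming the I/O reply; pauses if shutdown is pending. */
static void toy_vcpu_end_shutdown_deferral(struct toy_vcpu *v)
{
    v->defer_shutdown = false;
    toy_vcpu_check_shutdown(v);
}

/* Pause every vcpu except those with an I/O request in flight. */
static void toy_domain_shutdown(struct toy_domain *d)
{
    d->is_shutting_down = true;
    for (size_t i = 0; i < d->nr_vcpus; i++) {
        struct toy_vcpu *v = &d->vcpu[i];

        if (v->defer_shutdown)
            continue; /* left running so it can consume its reply */
        v->paused_for_shutdown = true;
    }
}
```

In this model, a vcpu which never reaches the "end deferral" step after
shutdown has started stays unpaused indefinitely, which is the sort of
behaviour the log above hints at.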

Evidently, this is not happening.

Marek: can you add a BUG() after the weird PIO printing?  That should
confirm whether we're getting into handle_pio() via the
handle_hvm_io_completion() path, or via the vmexit path (in which case
we're fully re-entering the guest).

I suspect you can drop the debugging of XEN_DOMCTL_destroydomain - I
think it's just noise at the moment.

However, it would be very helpful to see the vcpus which fall into
domain_shutdown()'s "else if ( v->defer_shutdown ) continue;" path.

> Another question though: In 4.13 the log message next to the
> domain_crash() I assume you're hitting is "Weird HVM ioemulation
> status", not "Weird PIO status", and the debugging patch you
> referenced doesn't have any change there. Andrew's recent
> change to master, otoh, doesn't use the word "weird" anymore. I
> can therefore only guess that the value logged is still
> hvmemul_do_pio_buffer()'s return value, i.e. X86EMUL_UNHANDLEABLE.
> Please confirm.

It's the first draft of the patch which I did, before submitting to
xen-devel.  We do have X86EMUL_UNHANDLEABLE at this point, but it's not
terribly helpful - there are loads of paths which fail silently with
this error.

~Andrew
