On 11.08.20 12:19, Julien Grall wrote:
Hi Julien, Stefano
Hi Stefano,
On 11/08/2020 00:34, Stefano Stabellini wrote:
On Mon, 10 Aug 2020, Oleksandr wrote:
On 08.08.20 01:19, Oleksandr wrote:
On 08.08.20 00:50, Stefano Stabellini wrote:
On Fri, 7 Aug 2020, Oleksandr wrote:
On 06.08.20 03:37, Stefano Stabellini wrote:
Hi Stefano,
Trying to simulate the IO_RETRY handling mechanism (according to the model below), I continuously get IO_RETRY from try_fwd_ioserv() ...
OK, thanks for the details. My interpretation seems to be correct.
In which case, it looks like xen/arch/arm/io.c:try_fwd_ioserv should return IO_RETRY. Then, xen/arch/arm/traps.c:do_trap_stage2_abort_guest also needs to handle try_handle_mmio returning IO_RETRY the first time around, and IO_HANDLED after QEMU does its job.
What should do_trap_stage2_abort_guest do on IO_RETRY? Simply return early and let the scheduler do its job? Something like:
    enum io_state state = try_handle_mmio(regs, hsr, gpa);

    switch ( state )
    {
    case IO_ABORT:
        goto inject_abt;
    case IO_HANDLED:
        advance_pc(regs, hsr);
        return;
    case IO_RETRY:
        /* finish later */
        return;
    case IO_UNHANDLED:
        /* IO unhandled, try another way to handle it. */
        break;
    default:
        ASSERT_UNREACHABLE();
    }
Then, xen/arch/arm/ioreq.c:handle_mmio() gets called by handle_hvm_io_completion() after QEMU completes the emulation. Today, handle_mmio just sets the user register with the read value. But it would be better if it called the original function do_trap_stage2_abort_guest again, to actually retry the original operation. This time do_trap_stage2_abort_guest calls try_handle_mmio() and gets IO_HANDLED instead of IO_RETRY.
I may be missing some important point, but I fail to see why try_handle_mmio (try_fwd_ioserv) would return IO_HANDLED instead of IO_RETRY at this stage. Or does the current try_fwd_ioserv() logic need rework?
I think you should check ioreq->state in try_fwd_ioserv(): if the result is ready, ioreq->state will be STATE_IORESP_READY, and you can return IO_HANDLED.
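Something like the sketch below, to illustrate (only a sketch: names such as vio->io_req, hvm_select_ioreq_server() and hvm_send_ioreq() are taken from my reading of your RFC and of the x86 side, so they may not match your tree exactly):

    static enum io_state try_fwd_ioserv(struct cpu_user_regs *regs,
                                        struct vcpu *v, mmio_info_t *info)
    {
        struct hvm_vcpu_io *vio = &v->arch.hvm.hvm_io;
        ioreq_t p = {
            .type = IOREQ_TYPE_COPY,
            .addr = info->gpa,
            .size = 1 << info->dabt.size,
            .dir  = !info->dabt.write,
            .data = get_user_reg(regs, info->dabt.reg),
        };
        struct hvm_ioreq_server *s;

        switch ( vio->io_req.state )
        {
        case STATE_IOREQ_NONE:
            /* No request pending yet: forward it below. */
            break;
        case STATE_IORESP_READY:
            /* QEMU has produced the answer: the access is handled. */
            return IO_HANDLED;
        default:
            return IO_ABORT;
        }

        s = hvm_select_ioreq_server(v->domain, &p);
        if ( !s )
            return IO_UNHANDLED;

        /* Stash the request so the completion can find the response. */
        vio->io_req = p;

        /*
         * Forward to the device model; the Arm port is assumed to map
         * this to IO_RETRY so that we come back for the reply.
         */
        return hvm_send_ioreq(s, &p, 0);
    }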
I optimized the test patch a bit (now it looks much simpler). I didn't face any issues during a quick test.
Both patches get much closer to following the proper state machine, great! This patch is certainly a good improvement, but I think the slightly larger patch you sent earlier is even better. It makes the following additional changes that would be good to have:
- try_fwd_ioserv returns IO_HANDLED on state == STATE_IORESP_READY
- handle_mmio simply calls do_trap_stage2_abort_guest
I don't think we should call do_trap_stage2_abort_guest() as part of the completion because:
 * The function do_trap_stage2_abort_guest() uses registers that are not context switched (such as FAR_EL2). I/O handling is split in two parts, likely with a context switch in the middle; the second part is the completion (i.e. the call to handle_mmio()). So the system registers would be incorrect.
 * A big chunk of do_trap_stage2_abort_guest() is not necessary for the completion. For instance, there is no need to try to translate the guest virtual address to a guest physical address.
Therefore the version below is probably the best approach.
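To illustrate, the completion can then be a small dedicated helper which only consumes the stashed ioreq, updates the target register, and advances the PC, without going through the abort path again. A sketch (this is not the patch below; helper names such as sign_extend() and the vio->io_req field are assumptions on my side):

    /* Called from handle_hvm_io_completion() once QEMU has answered. */
    static void handle_mmio_completion(struct vcpu *v)
    {
        struct cpu_user_regs *regs = guest_cpu_user_regs();
        const union hsr hsr = { .bits = regs->hsr };
        const struct hsr_dabt dabt = hsr.dabt;
        ioreq_t *ioreq = &v->arch.hvm.hvm_io.io_req;

        /* The response is ready, so consume the pending request. */
        ioreq->state = STATE_IOREQ_NONE;

        /* For a read, copy the value back into the target register. */
        if ( !dabt.write )
            set_user_reg(regs, dabt.reg, sign_extend(dabt, ioreq->data));

        /*
         * Skip the faulting instruction, exactly as for IO_HANDLED.
         * Note there is no FAR_EL2 access and no address translation
         * here, which avoids both problems above.
         */
        advance_pc(regs, hsr);
    }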
Indeed, the first version (calling do_trap_stage2_abort_guest() for the completion) is racy. When testing it more heavily I sometimes faced an issue which resulted in the DomU getting stuck completely:

(XEN) d2v1: vGICD: bad read width 0 r11 offset 0x000f00

I didn't investigate the issue in detail, but I assumed the code in do_trap_stage2_abort_guest() caused it. This was the main reason I decided to optimize the initial patch (and kept only advance_pc). Reading Julien's answer, I now understand what could happen.
--
Regards,
Oleksandr Tyshchenko