On 02/08/2022 16.09, Peter Maydell wrote:
On Tue, 2 Aug 2022 at 14:53, Thomas Huth <th...@redhat.com> wrote:
The XHCI code could enter an endless loop in case the guest points
QEMU to fetch TRBs from invalid memory areas. Fix it by properly
checking the return value of dma_memory_read().
It certainly makes sense to check the return value from
dma_memory_read(), but how can we end up in an infinite loop
if it fails? Either:
(1) there is some combination of data values which allow an
infinite loop, so we can hit those if the DMA access fails and
we use bogus uninitialized data. But then the guest could
maliciously pass us those bad values and create an infinite
loop without a failing DMA access, ie commit 05f43d44e4b
missed a case. If so we need to fix that!
No, as far as I can see, that's not the case.
Or (2) there isn't actually an infinite loop possible here,
and we're just tidying up a case which is C undefined
behaviour but in practice unlikely to have effects any
worse than the guest could provoke anyway (ie running up
to the TRB_LINK_LIMIT and stopping). In this case the
commit message here is a bit misleading.
So from what I understand, this is happening: The guest somehow manages to
trick QEMU in a state where it reads through a set of TRBs where none is
marked in a way that could cause the function to return. QEMU then reads all
memory 'till the end of RAM and then continues to loop through no-mans-land
since the return value of dma_memory_read() is not checked. Since
dma_memory_read() was not able to fetch the memory, the "trb" struct does
not get updated, i.e. continues to contain data that does not cause the
function to return. Thus the while(1) loop in xhci_ring_chain_length()
happily runs through the whole 64-bit address space range, failing to read
memory and thus leaving the loop. Maybe the loop is not really endless if it
wrap-arounds the 64-bit address space at one point in time and finally finds
another chunk of memory in the real RAM that causes the loop to exit, but it
would still certainly take a long time where QEMU is just completely
unresponsive and busy stepping through the no-mans-land.
Does that make sense to you now? If so, I can respin a v2 with an improved
patch description if you like.
Thomas