On Mon, Sep 25, 2023 at 03:53:23PM -0400, John Snow wrote: > Niklas, I'm sorry to lean on you here a little bit - You've been > working on the SATA side of this a bit more often, can you let me know > if you think this patch is safe?
FWIW, I prefer Fiona's patch series which calls blk_aio_cancel() before calling ide_reset(): https://lore.kernel.org/qemu-devel/20230906130922.142845-1-f.eb...@proxmox.com/T/#u Patch 2/2 also adds a qtest which fails before patch 1/2, and passes after patch 1/2. > > I'm not immediately sure what the impact of applying it is, but I have > some questions about it: > > (1) When does ide_dma_cb get invoked when DRQ_STAT is set but the > return code we were passed is not < 0, and > > (2) what's the impact of just *not* executing the end-of-transfer > block here; what happens to the state machine? > > On Thu, Sep 21, 2023 at 12:07 PM Simon Rowe <simon.r...@nutanix.com> wrote: > > > > When an IDE controller is reset, its internal state is being cleared > > before any outstanding I/O is cancelled. If a response to DMA is > > received in this window, the aio callback will incorrectly continue > > with the next part of the transfer (now using sector 0 from > > the cleared controller state). > > Eugh, yikes. It feels like we should fix the cancellation ... I'm > worried that if we've reset the state machine and we need to bail on a > DMA callback that the heuristics we use for that will eventually be > wrong, unless I am mistaken and this is totally safe and reliable for > a reason I don't intuitively see right away. > > Thoughts? Since Simon has a reliable reproducer, and has offered to test Fiona's patch, perhaps we should simply take him up on that offer :) Kind regards, Niklas