On 06/29/2015 10:24 AM, Stefan Hajnoczi wrote: > On Fri, Jun 26, 2015 at 02:27:39PM -0400, John Snow wrote: >> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 >> >> >> >> On 06/26/2015 11:35 AM, Stefan Hajnoczi wrote: >>> On Mon, Jun 22, 2015 at 08:21:05PM -0400, John Snow wrote: >>>> Handle NCQ failures for cases where we want to halt the VM on >>>> IO errors. >>>> >>>> Signed-off-by: John Snow <js...@redhat.com> --- >>>> hw/ide/ahci.c | 17 +++++++++++++++-- hw/ide/ahci.h | 1 + >>>> hw/ide/internal.h | 1 + 3 files changed, 17 insertions(+), 2 >>>> deletions(-) >>>> >>>> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c index >>>> 71b5085..a838317 100644 --- a/hw/ide/ahci.c +++ >>>> b/hw/ide/ahci.c @@ -959,13 +959,25 @@ static void ncq_cb(void >>>> *opaque, int ret) return; } >>>> >>>> + ncq_tfs->halt = false; >>> >>> Why does halt need to be cleared here? >>>
I remember my thinking now. I didn't want to leave it dangling after a successful command, so I wanted to clear it on the callback. It's harmless, but it's weird to have it set afterwards and I wanted it to be very crystal clear what it meant when it was set. (With this usage case, 'halt' being set ALWAYS means that there's a command to retry -- you don't have to test other variables like 'used' to tell if it's a left over flag.) I actually want to leave it here, now. >> >> Might make more sense to clear it just on the beginning of every >> command, in execute(). >> >> There's no strong reason here other than "If there's an error and >> it should be set, it'll get reset again pretty soon." It's just a >> default state. >> >> I could move it from process to execute. > > By the way, does ->halt need to be cleared in ahci_reset_port()? > > I'm thinking of a scenario where requests have failed and the port > is reset. We should not try to reissue those commands after > reset. > If I keep it like I have it now (where it clears itself after a successful command), we can clear it alongside the 'used' flag to protect against the case where the guest issues a reset almost simultaneously with an errored read command, where it might be that the NCQ command halts the VM, then we try to reset immediately after boot. (Perilously tangential side-note: what's the default error action if you don't set rerror=stop or werror=stop? the ide code didn't make it particularly clear to me if it was IGNORE or REPORT.)