On Feb 11 13:24, Keith Busch wrote:
> On Thu, Feb 11, 2021 at 12:38:48PM +0900, Minwoo Im wrote:
> > On 21-02-11 12:00:11, Keith Busch wrote:
> > > But I would prefer to see advanced retry tied to real errors that can be
> > > retried, like if we got an EBUSY or EAGAIN errno or something like that.
> >
> > I have seen a thread [1] about ACRE. Forgive me if I misunderstood this
> > thread or missed something after this thread. It looks like the CRD field in
> > the CQE can be set for any NVMe error state, which means it *may* depend on
> > the device status.
>
> Right! Setting CRD values is at the controller's discretion for any
> error status as long as the host enables ACRE.
>
> > And this patch just introduced an internal temporary error state of
> > the controller by returning Command Interrupted status.
>
> It's just purely synthetic, though. I was hoping something more natural
> could trigger the status. That might not provide the deterministic
> scenario you're looking for, though.
>
> I'm not completely against using QEMU as a development/test vehicle for
> corner cases like this, but we are introducing a whole lot of knobs
> recently, and you practically need to be a QEMU developer to even find
> them. We probably should step up the documentation in the wiki along
> with these types of features.
>

Understood, I'll make docs/specs/nvme.txt and wiki documentation a
priority for 6.0.

> > I think, at this stage, we can go with some errors in the middle of the
> > AIO (nvme_aio_err()) for advanced retry. Shouldn't AIO errors be
> > retryable and supposed to be retried?
>
> Sure, we can assume that receiving an error in the AIO callback means
> the lower layers exhausted available recovery mechanisms.
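
For reference, the host-side view of the CRD handling discussed above can
be sketched roughly as follows. This is only a minimal sketch, assuming the
completion status layout from the NVMe base specification with the phase
tag already shifted out (SC in bits 7:0, CRD in bits 12:11, DNR in bit 14)
and CRDT1..CRDT3 from Identify Controller in units of 100 ms. All names
here (retry_delay_ms, may_retry, the STATUS_* macros) are made up for
illustration and are not QEMU or Linux identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical masks for the 15-bit CQE status field (phase tag removed). */
#define STATUS_SC_MASK    0x00ffu  /* Status Code, bits 7:0 */
#define STATUS_CRD_SHIFT  11       /* Command Retry Delay, bits 12:11 */
#define STATUS_CRD_MASK   0x3u
#define STATUS_DNR        0x4000u  /* Do Not Retry, bit 14 */

/* Generic status code for Command Interrupted. */
#define SC_COMMAND_INTERRUPTED 0x21u

/*
 * Return the delay in milliseconds the host should wait before retrying.
 * CRD == 0 means "no delay requested"; CRD == n (1..3) selects CRDTn,
 * which is expressed in units of 100 ms.
 */
static uint32_t retry_delay_ms(uint16_t status, const uint16_t crdt[3])
{
    unsigned crd = (status >> STATUS_CRD_SHIFT) & STATUS_CRD_MASK;

    return crd ? (uint32_t)crdt[crd - 1] * 100u : 0u;
}

/* A failed command may be retried only if the controller did not set DNR. */
static bool may_retry(uint16_t status)
{
    return status != 0 && !(status & STATUS_DNR);
}
```

With CRDT1 = 5, a Command Interrupted completion carrying CRD = 1 would
ask the host to wait 500 ms before resubmitting, while the same status
with DNR set would not be retried at all.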