On Tue, 19 Nov 2019 13:02:20 +0100 Cornelia Huck <coh...@redhat.com> wrote:
> On Tue, 19 Nov 2019 12:23:40 +0100
> Halil Pasic <pa...@linux.ibm.com> wrote:
>
> > On Mon, 18 Nov 2019 19:13:34 +0100
> > Cornelia Huck <coh...@redhat.com> wrote:
> >
> > > > EIO is returned by vfio-ccw mediated device when the backing
> > > > host subchannel is not operational anymore. So return cc=3
> > > > back to the guest, rather than returning a unit check.
> > > > This way the guest can take appropriate action such as
> > > > issue an 'stsch'.
> > >
> > > Hnm, I'm trying to recall whether that was actually a conscious choice,
> > > but I can't quite remember... the change does make sense at a glance,
> > > however.
> >
> > Is EIO returned if and only if the host subchannel/device is not
> > operational any more, or are there cases as well?
>
> Ok, I walked through the kernel code, and it seems -EIO can happen

Thanks Connie for having a look.

> - when we try to do I/O while in the NOT_OPER or STANDBY states... cc 3
>   makes sense in those cases

I do understand NOT_OPER, but I'm not sure about STANDBY. Here is what
the PoP says about cc 3 for SSCH.

"""
Condition code 3 is set, and no other action is taken, when the
subchannel is not operational for START SUBCHANNEL. A subchannel is
not operational for START SUBCHANNEL if the subchannel is not
provided in the channel subsystem, has no valid device number
associated with it, or is not enabled.
"""

Are we guaranteed to reflect one of these conditions back? Under what
circumstances do we expect that our request will find the device in
STANDBY?

> - when the cp is not initialized when trying to fetch the orb... which
>   is an internal vfio-ccw kernel module error

So the answer seems to be no: EIO is also used for things other than
'device not operational' in the sense of the s390 I/O architecture
(cc=3 and stuff).
AFAIR the idea was that EIO means something is broken, and we decided
to reflect that as a unit check (because the broader device -- the
actual device + our pass-through code == device for the guest) is
broken. So I think it was a conscious choice.

Regards,
Halil

>
> Btw., this patch only changes one of the handlers; I think you have to
> change all of start/halt/clear?
>
> [Might also be good to double-check the handling for the different
> instructions.]
>
> > Is the mapping
> > (cc to condition) documented? By the QEMU code I would think that
> > we already have ENODEV and EACCESS for 'not operational' -- no idea
> > why we need two codes though.
>
> -ENODEV: device gone
> -EACCES: no path operational
>
> We should be able to distinguish between the two; in the 'no path
> operational' case, the device may still be accessible with a different
> path mask in the request.
>
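
To make the mapping under discussion concrete, here is a rough sketch of
how the return codes mentioned in this thread would translate into what
the guest sees. It is only an illustration, not the actual QEMU code:
the enum and the function name are made up, and treating every other
error like -EIO is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for illustration only; not the QEMU API. */
enum sketch_outcome {
    SKETCH_CC0_OK,        /* request accepted */
    SKETCH_CC3_NOT_OPER,  /* cc 3: subchannel not operational */
    SKETCH_UNIT_CHECK,    /* reflect a unit check to the guest */
};

/*
 * Map the return value of the vfio-ccw request (0 or a negative errno)
 * to what gets reflected to the guest, following the thread:
 *   -ENODEV: device gone           -> cc 3
 *   -EACCES: no path operational   -> cc 3
 *   -EIO:    something is broken   -> unit check
 */
static enum sketch_outcome sketch_map_ret(int ret)
{
    assert(ret <= 0); /* only 0 or a negative errno is expected here */

    switch (ret) {
    case 0:
        return SKETCH_CC0_OK;
    case -ENODEV:
    case -EACCES:
        return SKETCH_CC3_NOT_OPER;
    case -EIO:
    default: /* assumption: any other error is treated like -EIO */
        return SKETCH_UNIT_CHECK;
    }
}
```

With the patch as proposed, the -EIO case would move up next to -ENODEV
and -EACCES. Whether that is right depends on whether -EIO only ever
means 'not operational', which is the open question above.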