On Fri, 2019-07-05 at 09:50 +0200, Kevin Wolf wrote:
> Am 04.07.2019 um 17:16 hat wangjie (P) geschrieben:
> > Hi, everybody:
> >
> > I developed a feature named "I/O hang". My intention is to solve the
> > following problem: if the backend storage of a VM disk is far-end
> > storage such as an IP SAN or FC SAN, the storage network link can
> > disconnect, causing I/O requests to return EIO to the guest, and the
> > guest filesystem then goes read-only. Even if the link recovers after
> > a while, the guest filesystem does not recover.
>
> The standard solution for this is configuring the guest device with
> werror=stop,rerror=stop so that the error is not delivered to the
> guest, but the VM is stopped. When you run 'cont', the request is
> then retried.
>
> > So I developed the "I/O hang" feature to solve this problem. It
> > works like this: when an I/O request returns EIO in the backend,
> > "I/O hang" catches the request in the qemu block layer and inserts
> > it into a rehandle queue instead of returning EIO to the guest. The
> > I/O request hangs in the guest, but this does not make the guest
> > filesystem read-only. "I/O hang" then loops, rehandling the queued
> > requests periodically (e.g. every 5 seconds), until they no longer
> > return EIO (i.e. when the backend storage link has recovered).
>
> Letting requests hang without stopping the VM risks the guest running
> into timeouts and deciding that its disk is broken.

I came to say exactly this. While developing nvme-mdev I also hit this
problem: due to assumptions built into the block layer, you can't just
let the guest wait forever for a request.
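(For anyone unfamiliar with the option Kevin mentions: the werror/rerror
policy is set per drive on the QEMU command line. A minimal example,
with purely illustrative paths and IDs, would be something like:

    qemu-system-x86_64 ... \
        -drive file=/path/to/disk.qcow2,format=qcow2,if=none,id=drive0,werror=stop,rerror=stop \
        -device virtio-blk-pci,drive=drive0

With this, a write or read error pauses the VM instead of completing the
request with an error, and 'cont' retries it.)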
Note that Linux's nvme driver does know how to retry failed requests,
including those that timed out, if that helps in any way.

Best regards,
	Maxim Levitsky

> As you say your "hang" and retry logic sits in the block layer, what
> do you do when you encounter a bdrv_drain() request?
>
> > In addition to the functionality described above, "I/O hang" can
> > also send an event to libvirt when the backend storage status
> > changes.
> >
> > Configuration:
> > 1. The "I/O hang" capability can be configured per disk, as a disk
> >    attribute.
> > 2. The "I/O hang" timeout value can also be configured per disk;
> >    when the storage link does not recover within the timeout,
> >    "I/O hang" stops rehandling the I/O requests and returns EIO to
> >    the guest.
> >
> > Are you interested in this feature? I intend to push it upstream to
> > the QEMU project; what is your opinion?
>
> Were you aware of werror/rerror? Before we add another mechanism, we
> need to be sure how the features compare, that the new mechanism
> provides a significant advantage, and that we keep code duplication
> as low as possible.
>
> Kevin
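For illustration only, here is a rough, self-contained C sketch of what
a rehandle queue along the lines wangjie describes might look like. All
names are hypothetical and nothing here is taken from the actual patch;
it is a sketch of the idea, not an implementation against QEMU's block
layer APIs:

    /* Hypothetical sketch of an EIO rehandle queue; not the actual patch.
     * Requests that fail with EIO are parked and retried periodically
     * instead of completing with an error, until a per-disk timeout
     * expires, at which point EIO is finally returned to the guest. */

    #include <errno.h>
    #include <stddef.h>
    #include <time.h>

    typedef struct IORequest IORequest;
    struct IORequest {
        IORequest *next;
        time_t first_failed;                /* when the request first saw EIO */
        int (*submit)(IORequest *req);      /* resubmit to the backend */
        void (*complete)(IORequest *req, int ret); /* deliver result to guest */
    };

    typedef struct {
        IORequest *head;
        int timeout_s;                      /* per-disk rehandle timeout */
    } RehandleQueue;

    /* Called on completion: either finish normally or park for retry. */
    static void rehandle_on_complete(RehandleQueue *q, IORequest *req, int ret)
    {
        if (ret != -EIO) {
            req->complete(req, ret);        /* success or non-EIO error */
            return;
        }
        if (req->first_failed == 0) {
            req->first_failed = time(NULL);
        }
        req->next = q->head;                /* park the request for retry */
        q->head = req;
    }

    /* Called periodically, e.g. every 5 seconds from a timer. */
    static void rehandle_tick(RehandleQueue *q)
    {
        IORequest *req = q->head, *next;
        q->head = NULL;
        for (; req; req = next) {
            next = req->next;
            if (time(NULL) - req->first_failed > q->timeout_s) {
                req->complete(req, -EIO);   /* timeout expired: give up */
                continue;
            }
            int ret = req->submit(req);     /* retry against the backend */
            rehandle_on_complete(q, req, ret);
        }
    }

Note that this sketch deliberately sidesteps Kevin's bdrv_drain()
question: a real implementation would have to either flush or fail the
parked requests when a drain is requested, since drain must not return
while requests are still outstanding.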