On Mon, 04/20 17:13, Cornelia Huck wrote: > On Fri, 17 Apr 2015 15:59:15 +0800 > Fam Zheng <f...@redhat.com> wrote: > > > Currently, virtio code chooses to kill QEMU if the guest passes any invalid > > data with vring. That has drawbacks such as losing unsaved data (e.g. when > > guest user is writing a very long email), or possible denial of service in > > a nested vm use case where virtio device is passed through. > > > > virtio-1 has introduced a new status bit "NEEDS RESET" which could be used > > to > > improve this by communicating the error state between virtio devices and > > drivers. The device notifies guest upon setting the bit, then the guest > > driver > > should detect this bit and report to userspace, or recover the device by > > resetting it. > > > > This series makes necessary changes in virtio core code, based on which > > virtio-blk is converted. Other devices now keep the existing behavior by > > passing in "error_abort". They will be converted in following series. The > > Linux > > driver part will also be worked on. > > > > One concern with this behavior change is that it's now harder to notice the > > actual driver bug that caused the error, as the guest continues to run. To > > address that, we could probably add a new error action option to virtio > > devices, similar to the "read/write werror" in block layer, so the vm > > could be > > paused and the management will get an event in QMP like pvpanic. This work > > can > > be done on top. > > In principle, this looks nice; I'm not sure however how this affects > non-virtio-1 devices. > > If a device is operating in virtio-1 mode, everything is clearly > specified: The guest is notified and if it is aware of the NEEDS_RESET > bit, it can react accordingly. > > But what about legacy devices? Even if they are notified, they don't > know to check for NEEDS_RESET - and I'm not sure if the undefined > behaviour after NEEDS_RESET might lead to bigger trouble than killing > off the guest. >
The device should become unresponsive to VQ output until guest issues a reset through bus commands. Do you have an example of "big trouble" in mind? Fam