On Sun, 10 Jul 2016 09:28:41 +0800 Zhou Jie <zhoujie2...@cn.fujitsu.com> wrote:
> Hi Alex,
>
> > The variable clearly isn't visible to the user, so the user can know
> > whether the kernel supports this feature, but not whether the feature
> > is currently active. Perhaps there's no way to avoid races completely,
> > but don't you expect that if we define that certain operations are
> > blocked after an error notification that a user may want some way to
> > poll for whether the block is active after a reset rather than simply
> > calling a blocked interface to probe for it?
> Yes, I will use access blocked function, not the variable.

I don't understand what this means.

> > As we've discussed before, the AER notification needs to be relayed to
> > the user without delay, otherwise we only increase the gap where the
> > user might consume bogus data. It also only seems reasonable to modify
> > the behavior of the interfaces (ie. blocking) if the user is notified,
> > which would be through the existing error notifier. We can never
> > depend on a specific behavior from the user, we may be dealing with a
> > malicious user.
> >
> > We already disable interrupts in vfio_pci_disable() simply by calling
> > the ioctl function directly.
> Sorry, I want to know where is vfio_pci_disable invoked.
> I can't find it in ioctl function.

You have the code, vfio_pci_disable() is invoked when the vfio device
file descriptor is released. It's not in the ioctl, it calls the ioctl
as the user would to disable all interrupts on the device.

> > If we simply disable and re-enable interrupts as you propose,
> > how does the user deal with edge triggered
> > interrupts that may have occurred during that period? Are they lost?
> > Should we instead leave the interrupts enabled but skip
> > eventfd_signal() in the interrupts handlers, queuing interrupts for
> > re-delivery after the device is resumed?
> Yes, they will lost.

Is that acceptable? This is part of the problem I have with silently
disabling interrupt delivery via the command register across reset.
It seems more non-deterministic than properly disabling interrupts and
requiring the user to reinitialize them after error.

> > Or does it make more sense to
> > simply disable the interrupts as done in vfio_pci_disable() and define
> > that the user needs to re-establish interrupts before continuing after
> > an error event? Thanks,
> If user invoked the vfio_pci_disable by ioctl function.

I'm in no way suggesting that a user invoke vfio_pci_disable(), I'm
just trying to point out that vfio_pci_disable() already does a
teardown of interrupts, similar to what seems to be required here.

> Yes, user should re-establish interrupts before
> continuing after an error event.

So if we define that users should re-establish interrupts after an
error event, then what's the point of only doing command register
masking of the interrupts and requiring the user to both tear-down the
interrupts and re-establish them?

Thanks,
Alex
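
For reference, a minimal userspace sketch of the "tear down, then
re-establish" sequence discussed above, using the VFIO_DEVICE_SET_IRQS
ioctl from <linux/vfio.h>. The helper names (fill_irq_teardown,
alloc_irq_enable, reestablish_irq) are hypothetical and only for
illustration. The choice of a single vector on one IRQ index is an
assumption, and nothing here is meant as the agreed-upon behaviour for
the AER case. The teardown request uses DATA_NONE with ACTION_TRIGGER
and count 0, which is the same request shape vfio_pci_disable() issues
internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

/* Fill a request that tears down all triggers on one IRQ index:
 * DATA_NONE + ACTION_TRIGGER with count == 0. */
static void fill_irq_teardown(struct vfio_irq_set *set, uint32_t index)
{
	assert(set != NULL);
	memset(set, 0, sizeof(*set));
	set->argsz = sizeof(*set);
	set->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = index;
	set->start = 0;
	set->count = 0;
}

/* Allocate a request that attaches one eventfd as the trigger for
 * vector 0 of the given index (single vector is an assumption).
 * The caller frees the result. */
static struct vfio_irq_set *alloc_irq_enable(uint32_t index, int32_t efd)
{
	size_t sz = sizeof(struct vfio_irq_set) + sizeof(int32_t);
	struct vfio_irq_set *set = calloc(1, sz);

	if (!set)
		return NULL;
	set->argsz = (uint32_t)sz;
	set->flags = VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER;
	set->index = index;
	set->start = 0;
	set->count = 1;
	memcpy(set->data, &efd, sizeof(efd));
	return set;
}

/* Hypothetical helper: what a user might do after an error event if
 * the rule is "re-establish interrupts before continuing".
 * Returns 0 on success, -1 on failure. */
static int reestablish_irq(int device_fd, uint32_t index, int32_t efd)
{
	struct vfio_irq_set off;
	struct vfio_irq_set *on;
	int ret;

	fill_irq_teardown(&off, index);
	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, &off);
	if (ret)
		return ret;

	on = alloc_irq_enable(index, efd);
	if (!on)
		return -1;
	ret = ioctl(device_fd, VFIO_DEVICE_SET_IRQS, on);
	free(on);
	return ret;
}
```

A caller would pass the VFIO device fd, an index such as
VFIO_PCI_MSI_IRQ_INDEX, and an eventfd obtained from eventfd(2). Any
edge interrupts that fired between the error and the re-enable are not
replayed by this sequence.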