Hi Alex,
The variable clearly isn't visible to the user, so the user can know whether the kernel supports this feature, but not whether the feature is currently active. Perhaps there's no way to avoid races completely, but don't you expect that if we define that certain operations are blocked after an error notification that a user may want some way to poll for whether the block is active after a reset rather than simply calling a blocked interface to probe for it?
Yes, I will use the access-blocked function, not the variable.
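As an illustration only, here is a minimal user-space sketch of a blocked-state query that a user could poll after a reset instead of probing a blocked interface. All names (`demo_dev`, `demo_access_blocked()`, and so on) are hypothetical and are not taken from the actual patch or from the vfio-pci code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for per-device state; NOT the real vfio-pci struct. */
struct demo_dev {
    atomic_bool aer_blocked;   /* true between error notification and resume */
};

static void demo_dev_init(struct demo_dev *d)
{
    assert(d != NULL);
    atomic_init(&d->aer_blocked, false);
}

/* Would be called from the AER error_detected path (assumption). */
static void demo_error_detected(struct demo_dev *d)
{
    atomic_store(&d->aer_blocked, true);
}

/* Would be called once the device has been reset and resumed (assumption). */
static void demo_resume(struct demo_dev *d)
{
    atomic_store(&d->aer_blocked, false);
}

/* The query the user polls instead of calling a blocked interface. */
static bool demo_access_blocked(struct demo_dev *d)
{
    return atomic_load(&d->aer_blocked);
}
```

The state lives behind a function rather than a raw variable so that the locking or atomicity can change later without changing what the user sees.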
As we've discussed before, the AER notification needs to be relayed to the user without delay, otherwise we only increase the gap where the user might consume bogus data. It also only seems reasonable to modify the behavior of the interfaces (i.e. blocking) if the user is notified, which would be through the existing error notifier. We can never depend on a specific behavior from the user; we may be dealing with a malicious user. We already disable interrupts in vfio_pci_disable() simply by calling the ioctl function directly.
Sorry, could you tell me where vfio_pci_disable() is invoked? I can't find it in the ioctl function.
If we simply disable and re-enable interrupts as you propose, how does the user deal with edge-triggered interrupts that may have occurred during that period? Are they lost? Should we instead leave the interrupts enabled but skip eventfd_signal() in the interrupt handlers, queuing interrupts for re-delivery after the device is resumed?
Yes, they will be lost.
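For comparison, the alternative raised in the question (leave interrupts enabled, skip the signal while blocked, and re-deliver after resume) could look roughly like the user-space sketch below. It is only a model under stated assumptions: `demo_signal()` stands in for eventfd_signal(), and every name and the fixed vector count are hypothetical. Note that in this sketch several edges arriving on the same vector while blocked collapse into a single re-delivery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_VECTORS 8   /* hypothetical limit, for the sketch only */

struct demo_irq_state {
    bool blocked;                          /* set after an error notification */
    bool pending[DEMO_MAX_VECTORS];        /* vectors that fired while blocked */
    unsigned delivered[DEMO_MAX_VECTORS];  /* count of signals sent to the user */
};

/* Stand-in for eventfd_signal(): just counts deliveries per vector. */
static void demo_signal(struct demo_irq_state *s, size_t vec)
{
    s->delivered[vec]++;
}

/* Interrupt handler: remember the interrupt instead of signalling if blocked. */
static void demo_irq_handler(struct demo_irq_state *s, size_t vec)
{
    assert(s != NULL && vec < DEMO_MAX_VECTORS);
    if (s->blocked) {
        s->pending[vec] = true;
        return;
    }
    demo_signal(s, vec);
}

static void demo_irq_block(struct demo_irq_state *s)
{
    s->blocked = true;
}

/* Unblock and re-deliver everything that was queued; returns how many. */
static unsigned demo_irq_resume(struct demo_irq_state *s)
{
    unsigned redelivered = 0;

    s->blocked = false;
    for (size_t vec = 0; vec < DEMO_MAX_VECTORS; vec++) {
        if (s->pending[vec]) {
            s->pending[vec] = false;
            demo_signal(s, vec);
            redelivered++;
        }
    }
    return redelivered;
}
```

A real implementation would also need locking between the handler and the resume path, which this single-threaded model leaves out.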
Or does it make more sense to simply disable the interrupts as done in vfio_pci_disable() and define that the user needs to re-establish interrupts before continuing after an error event? Thanks,
If the user invoked vfio_pci_disable() through the ioctl function, then yes, the user should re-establish interrupts before continuing after an error event.

Sincerely,
Zhoujie