> >>> wouldn't you also need to do that somewhere? Unless the driver
> >>> does it at startup?
> >>
> >> VFIO performs GPU reset so I'd expect the GPUs to flush its caches
> >> without any software interactions. Am I hoping for too much here?
> >
> > Sadly you are. It's not the GPU caches that need flushing, it's the CPU
> > caches. This needs to happen as part of the reset sequence, so I guess
> > you would need to add it to the VFIO driver.
>
> Well, ok. Caches need flushing, will look into this but this fencing is
> still needed, is not it?

Yes. Although without the flushing I think you may get HMIs on any
subsequent driver loads.

So from the point of view of what happens on the Skiboot/HW side this
looks ok so long as all links on an NPU are assigned to the same guest
(as this call resets every link on a given NPU).

Acked-by: Alistair Popple <alist...@popple.id.au>

> > - Alistair
> >
> >>> - Alistair
> >>>
> >>>>> - Alistair
> >>>>>
> >>>>>>> - Alistair
> >>>>>>>
> >>>>>>>>> - Alistair
> >>>>>>>>>
> >>>>>>>>> On Monday, 15 October 2018 6:17:51 PM AEDT Alexey Kardashevskiy wrote:
> >>>>>>>>>> Ping?
> >>>>>>>>>>
> >>>>>>>>>> On 02/10/2018 13:20, Alexey Kardashevskiy wrote:
> >>>>>>>>>>> The skiboot firmware has a hot reset handler which fences the
> >>>>>>>>>>> NVIDIA V100 GPU RAM on Witherspoons and makes accesses no-op
> >>>>>>>>>>> instead of throwing HMIs:
> >>>>>>>>>>> https://github.com/open-power/skiboot/commit/fca2b2b839a67
> >>>>>>>>>>>
> >>>>>>>>>>> Now we are going to pass V100 via VFIO which most certainly
> >>>>>>>>>>> involves KVM guests which are often terminated without getting
> >>>>>>>>>>> a chance to offline GPU RAM so we end up with a running machine
> >>>>>>>>>>> with misconfigured memory. Accessing this memory produces
> >>>>>>>>>>> hardware management interrupts (HMI) which bring the host down.
> >>>>>>>>>>>
> >>>>>>>>>>> To suppress HMIs, this wires up this hot reset hook to
> >>>>>>>>>>> vfio_pci_disable() via pci_disable_device() which switches NPU2
> >>>>>>>>>>> to a safe mode and prevents HMIs.
> >>>>>>>>>>>
> >>>>>>>>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> >>>>>>>>>>> ---
> >>>>>>>>>>> Changes:
> >>>>>>>>>>> v2:
> >>>>>>>>>>> * updated the commit log
> >>>>>>>>>>> ---
> >>>>>>>>>>>
> >>>>>>>>>>>  arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++++++++++
> >>>>>>>>>>>  1 file changed, 10 insertions(+)
> >>>>>>>>>>>
> >>>>>>>>>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>>>> index cde7102..e37b9cc 100644
> >>>>>>>>>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> >>>>>>>>>>> @@ -3688,6 +3688,15 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
> >>>>>>>>>>>  	pnv_ioda_release_pe(pe);
> >>>>>>>>>>>  }
> >>>>>>>>>>>
> >>>>>>>>>>> +static void pnv_npu_disable_device(struct pci_dev *pdev)
> >>>>>>>>>>> +{
> >>>>>>>>>>> +	struct eeh_dev *edev = pci_dev_to_eeh_dev(pdev);
> >>>>>>>>>>> +	struct eeh_pe *eehpe = edev ? edev->pe : NULL;
> >>>>>>>>>>> +
> >>>>>>>>>>> +	if (eehpe && eeh_ops && eeh_ops->reset)
> >>>>>>>>>>> +		eeh_ops->reset(eehpe, EEH_RESET_HOT);
> >>>>>>>>>>> +}
> >>>>>>>>>>> +
> >>>>>>>>>>>  static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
> >>>>>>>>>>>  {
> >>>>>>>>>>>  	struct pnv_phb *phb = hose->private_data;
> >>>>>>>>>>> @@ -3732,6 +3741,7 @@ static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
> >>>>>>>>>>>  	.reset_secondary_bus = pnv_pci_reset_secondary_bus,
> >>>>>>>>>>>  	.dma_set_mask = pnv_npu_dma_set_mask,
> >>>>>>>>>>>  	.shutdown = pnv_pci_ioda_shutdown,
> >>>>>>>>>>> +	.disable_device = pnv_npu_disable_device,
> >>>>>>>>>>>  };
> >>>>>>>>>>>
> >>>>>>>>>>>  static const struct pci_controller_ops pnv_npu_ocapi_ioda_controller_ops = {