> reset_ntl() does what npu2_dev_procedure_reset() does plus more stuff, > there nothing really in npu2_dev_procedure_reset() which reset_ntl() > does not do already from the hardware standpoint. And it did stop HMIs > for me though. > > but ok, what will be sufficient then if not reset_ntl()?
Argh, yes you are correct. Specifically both npu2_dev_procedure_reset() and reset_ntl() contain: /* NTL Reset */ val = npu2_read(ndev->npu, NPU2_NTL_MISC_CFG1(ndev)); val |= PPC_BIT(8) | PPC_BIT(9); npu2_write(ndev->npu, NPU2_NTL_MISC_CFG1(ndev), val); Which should fence the brick. However from what I recall there was more to reliably preventing HMIs than merely fencing the brick. It invovled a sequence of fencing and flushing the cache with dcbf instructions at the right time which is why we also have the FLR. Unfortunately I don't know the precise details, perhaps if we send enough coffee Balbir's way he might be able remind us? - Alistair > > > > > > - Alistair > > > >> > >> > >> > >>> > >>> - Alistair > >>> > >>> On Monday, 15 October 2018 6:17:51 PM AEDT Alexey Kardashevskiy wrote: > >>>> Ping? > >>>> > >>>> > >>>> On 02/10/2018 13:20, Alexey Kardashevskiy wrote: > >>>>> The skiboot firmware has a hot reset handler which fences the NVIDIA > >>>>> V100 > >>>>> GPU RAM on Witherspoons and makes accesses no-op instead of throwing > >>>>> HMIs: > >>>>> https://github.com/open-power/skiboot/commit/fca2b2b839a67 > >>>>> > >>>>> Now we are going to pass V100 via VFIO which most certainly involves > >>>>> KVM guests which are often terminated without getting a chance to > >>>>> offline > >>>>> GPU RAM so we end up with a running machine with misconfigured memory. > >>>>> Accessing this memory produces hardware management interrupts (HMI) > >>>>> which bring the host down. > >>>>> > >>>>> To suppress HMIs, this wires up this hot reset hook to > >>>>> vfio_pci_disable() > >>>>> via pci_disable_device() which switches NPU2 to a safe mode and prevents > >>>>> HMIs. > >>>>> > >>>>> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru> > >>>>> --- > >>>>> Changes: > >>>>> v2: > >>>>> * updated the commit log > >>>>> --- > >>>>> arch/powerpc/platforms/powernv/pci-ioda.c | 10 ++++++++++ > >>>>> 1 file changed, 10 insertions(+) > >>>>> > >>>>> diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c > >>>>> b/arch/powerpc/platforms/powernv/pci-ioda.c > >>>>> index cde7102..e37b9cc 100644 > >>>>> --- a/arch/powerpc/platforms/powernv/pci-ioda.c > >>>>> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c > >>>>> @@ -3688,6 +3688,15 @@ static void pnv_pci_release_device(struct > >>>>> pci_dev *pdev) > >>>>> pnv_ioda_release_pe(pe); > >>>>> } > >>>>> > >>>>> +static void pnv_npu_disable_device(struct pci_dev *pdev) > >>>>> +{ > >>>>> + struct eeh_dev *edev = pci_dev_to_eeh_dev(pdev); > >>>>> + struct eeh_pe *eehpe = edev ? edev->pe : NULL; > >>>>> + > >>>>> + if (eehpe && eeh_ops && eeh_ops->reset) > >>>>> + eeh_ops->reset(eehpe, EEH_RESET_HOT); > >>>>> +} > >>>>> + > >>>>> static void pnv_pci_ioda_shutdown(struct pci_controller *hose) > >>>>> { > >>>>> struct pnv_phb *phb = hose->private_data; > >>>>> @@ -3732,6 +3741,7 @@ static const struct pci_controller_ops > >>>>> pnv_npu_ioda_controller_ops = { > >>>>> .reset_secondary_bus = pnv_pci_reset_secondary_bus, > >>>>> .dma_set_mask = pnv_npu_dma_set_mask, > >>>>> .shutdown = pnv_pci_ioda_shutdown, > >>>>> + .disable_device = pnv_npu_disable_device, > >>>>> }; > >>>>> > >>>>> static const struct pci_controller_ops > >>>>> pnv_npu_ocapi_ioda_controller_ops = { > >>>>> > >>>> > >>>> > >>> > >>> > >> > >> > > > > > >