Hi Alexey,

On Wednesday, 11 July 2018 7:45:10 PM AEST Alexey Kardashevskiy wrote:
> On Thu,  7 Jun 2018 17:06:07 +1000
> Alexey Kardashevskiy <a...@ozlabs.ru> wrote:
> 
> > This brings NPU2 in a safe mode when it does not throw HMI if GPU
> > coherent memory is gone.

It might be helpful if you could describe the problem and what you are
trying to solve in a bit more depth. Assuming the memory was online, how are you
offlining it? If the memory has been online, merely fencing/hot-resetting the
NVLink is likely not sufficient, as you also need to flush caches prior to taking
the links down.
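
To illustrate the ordering I have in mind, here is a toy user-space model. It
is only a sketch and not kernel code: every name in it (npu_link_model,
model_offline_memory() and so on) is hypothetical, and it assumes the sequence
is offline the memory, then flush caches, then reset the link.

```c
#include <stdbool.h>

/* Hypothetical model of the GPU memory/link state; none of this is kernel API. */
struct npu_link_model {
	bool mem_online;   /* coherent GPU memory is still onlined by the host */
	bool caches_dirty; /* CPU caches may hold lines backed by GPU memory */
	bool link_up;      /* NVLink is up and usable */
};

/* Step 1 (assumed): offline the memory so no new lines get cached from it. */
static int model_offline_memory(struct npu_link_model *m)
{
	m->mem_online = false;
	return 0;
}

/* Step 2: flush caches; refused while the memory can still be re-cached. */
static int model_flush_caches(struct npu_link_model *m)
{
	if (m->mem_online)
		return -1;
	m->caches_dirty = false;
	return 0;
}

/* Step 3: take the link down; refused while dirty lines may need writeback. */
static int model_reset_link(struct npu_link_model *m)
{
	if (m->mem_online || m->caches_dirty)
		return -1;
	m->link_up = false;
	return 0;
}

/* The full sequence, stopping at the first step that fails. */
static int model_disable_device(struct npu_link_model *m)
{
	int rc;

	rc = model_offline_memory(m);
	if (rc)
		return rc;
	rc = model_flush_caches(m);
	if (rc)
		return rc;
	return model_reset_link(m);
}
```

In this model a bare model_reset_link() on a device with online memory fails,
which is roughly the case I am worried about with the hot reset alone.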

- Alistair

> > Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> 
> 
> Anyone, ping?
> 
> 
> > ---
> > 
> > The main aim for this is nvlink2 pass through, helps a lot.
> > 
> > 
> > ---
> >  arch/powerpc/platforms/powernv/pci-ioda.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/arch/powerpc/platforms/powernv/pci-ioda.c b/arch/powerpc/platforms/powernv/pci-ioda.c
> > index 66c2804..29f798c 100644
> > --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> > +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> > @@ -3797,6 +3797,16 @@ static void pnv_pci_release_device(struct pci_dev *pdev)
> >             pnv_ioda_release_pe(pe);
> >  }
> >  
> > +void pnv_npu_disable_device(struct pci_dev *pdev)
> > +{
> > +   struct eeh_dev *edev = pci_dev_to_eeh_dev(pdev);
> > +   struct eeh_pe *eehpe = edev ? edev->pe : NULL;
> > +
> > +   if (eehpe && eeh_ops && eeh_ops->reset) {
> > +           eeh_ops->reset(eehpe, EEH_RESET_HOT);
> > +   }
> > +}
> > +
> >  static void pnv_pci_ioda_shutdown(struct pci_controller *hose)
> >  {
> >     struct pnv_phb *phb = hose->private_data;
> > @@ -3841,6 +3851,7 @@ static const struct pci_controller_ops pnv_npu_ioda_controller_ops = {
> >     .reset_secondary_bus    = pnv_pci_reset_secondary_bus,
> >     .dma_set_mask           = pnv_npu_dma_set_mask,
> >     .shutdown               = pnv_pci_ioda_shutdown,
> > +   .disable_device         = pnv_npu_disable_device,
> >  };
> >  
> >  static const struct pci_controller_ops pnv_npu_ocapi_ioda_controller_ops = {
> 
> 
> 
> --
> Alexey
> 

