On Wed, 2018-12-19 at 08:52:13 UTC, Alexey Kardashevskiy wrote:
> The skiboot firmware has a hot reset handler which fences the NVIDIA V100
> GPU RAM on Witherspoons and makes accesses no-op instead of throwing HMIs:
> https://github.com/open-power/skiboot/commit/fca2b2b839a67
> 
> We are now going to pass the V100 through via VFIO, which most certainly
> involves KVM guests. Guests are often terminated without getting a chance
> to offline GPU RAM, so we end up with a running machine with misconfigured
> memory. Accessing this memory produces hardware management interrupts
> (HMIs), which bring the host down.
> 
> To suppress these HMIs, this wires the hot reset hook up to
> vfio_pci_disable() via pci_disable_device(); the reset switches NPU2 to
> a safe mode.
> 
> Signed-off-by: Alexey Kardashevskiy <a...@ozlabs.ru>
> Acked-by: Alistair Popple <alist...@popple.id.au>
> Reviewed-by: David Gibson <da...@gibson.dropbear.id.au>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/ab7032e793f9ad799ca2692046fba5

cheers
