On 6/18/25 10:51, Peter Zijlstra wrote:
> On Tue, Jun 17, 2025 at 09:12:12PM -0500, Mario Limonciello wrote:
>
>> How about if we reset before the kexec? There is a symbol for drivers to
>> use to know they're about to go through kexec to do $THINGS.
>>
>> Something like this:
>>
>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> index 0fc0eeedc6461..2b1216b14d618 100644
>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
>> @@ -34,6 +34,7 @@
>>
>>  #include <linux/cc_platform.h>
>>  #include <linux/dynamic_debug.h>
>> +#include <linux/kexec.h>
>>  #include <linux/module.h>
>>  #include <linux/mmu_notifier.h>
>>  #include <linux/pm_runtime.h>
>> @@ -2544,6 +2545,9 @@ amdgpu_pci_shutdown(struct pci_dev *pdev)
>>  	adev->mp1_state = PP_MP1_STATE_UNLOAD;
>>  	amdgpu_device_ip_suspend(adev);
>>  	adev->mp1_state = PP_MP1_STATE_NONE;
>> +
>> +	if (kexec_in_progress)
>> +		amdgpu_asic_reset(adev);
>>  }
>>
>>  static int amdgpu_pmops_prepare(struct device *dev)
>
> I will throw this in the dev kernel... I'll let you know.

Mhm, if the drivers are informed about the kexec then we could also send
the unload/reset packet only to the PSP, IIRC.

That might have a better chance of succeeding than a full ASIC reset.
Lijo should know more about that.

Regards,
Christian.