On Thu, Mar 07, 2024 at 03:50:15PM +1100, Jonathan Gray wrote:
> Thanks for the detailed report.
>
> smu7_powergate_uvd+0x23
> pp_set_powergating_by_smu+0x15a
> amdgpu_dpm_enable_uvd+0xc1
> taskq_thread
>
> POLARIS10 has UVD 6.3
>
> If driver init fails the task gets removed by:
>
> cancel_delayed_work_sync(&adev->uvd.idle_work);
>
> uvd_v6_0_hw_fini
> amdgpu_device_ip_fini_early
> amdgpu_device_fini_hw
> amdgpu_driver_unload_kms
> amdgpu_driver_load_kms
> amdgpu_attachhook
>
> but your trace must occur before that gets cleaned up
>
> smu7_powergate_uvd+0x23 is
> /sys/dev/pci/drm/amd/pm/powerplay/hwmgr/smu7_clockpowergating.c:118
>
> 114 void smu7_powergate_uvd(struct pp_hwmgr *hwmgr, bool bgate)
> 115 {
> 116 	struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
> 117
> 118 	data->uvd_power_gated = bgate;
>
> Try the following revert of
> 'drm/amd/pm/smu7: fix a memleak in smu7_hwmgr_backend_init'
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=linux-6.6.y&id=ae7cbf935b9a1b41f65fe6443e7cd0c401500b20
>
> The matching OpenBSD commit was rev 1.9
> date: 2024/01/29 01:51:19; author: jsg; state: Exp; lines: +5 -1;
> commitid: cUHNbtd9MymExldJ;
>
> Index: sys/dev/pci/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c,v
> diff -u -p -r1.10 smu7_hwmgr.c
> --- sys/dev/pci/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c	6 Feb 2024 03:55:02 -0000	1.10
> +++ sys/dev/pci/drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c	7 Mar 2024 02:43:27 -0000
> @@ -2974,8 +2974,6 @@ static int smu7_hwmgr_backend_init(struc
>  		result = smu7_get_evv_voltages(hwmgr);
>  		if (result) {
>  			pr_info("Get EVV Voltage Failed. Abort Driver loading!\n");
> -			kfree(hwmgr->backend);
> -			hwmgr->backend = NULL;
>  			return -EINVAL;
>  		}
>  	} else {
> @@ -3021,10 +3019,8 @@ static int smu7_hwmgr_backend_init(struc
>  	}
>  
>  	result = smu7_update_edc_leakage_table(hwmgr);
> -	if (result) {
> -		smu7_hwmgr_backend_fini(hwmgr);
> +	if (result)
>  		return result;
> -	}
>  
>  	return 0;
>  }
>
Thank you for your response, Jonathan.

I will do as you have suggested and inform bugs@ of the result. It will
take me a day or so, however. Compiling the new kernel on another machine
will be fine. Getting it installed on the problem machine will need some
reconfiguration from the shell of an install75.img USB flash drive, which
should be straightforward.

--
aer