[AMD Official Use Only]

+Mario

I guess that means the functionality needs to be present in amdgpu for
APUs as well. At present, this is taken care of by the PMC driver for APUs.

Thanks,
Lijo
________________________________
From: amd-gfx <amd-gfx-boun...@lists.freedesktop.org> on behalf of Andrey Grodzovsky <andrey.grodzov...@amd.com>
Sent: Tuesday, March 8, 2022 9:55:03 PM
To: Shashank Sharma <contactshashanksha...@gmail.com>; amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>
Cc: Deucher, Alexander <alexander.deuc...@amd.com>; Somalapuram, Amaranath <amaranath.somalapu...@amd.com>; Koenig, Christian <christian.koe...@amd.com>; Sharma, Shashank <shashank.sha...@amd.com>
Subject: Re: [PATCH 1/2] drm: Add GPU reset sysfs event

On 2022-03-07 11:26, Shashank Sharma wrote:
> From: Shashank Sharma <shashank.sha...@amd.com>
>
> This patch adds a new sysfs event, which will indicate
> the userland about a GPU reset, and can also provide
> some information like:
> - which PID was involved in the GPU reset
> - what was the GPU status (using flags)
>
> This patch also introduces the first flag of the flags
> bitmap, which can be appended as and when required.

I am reminding again about another important piece of info which you can
add here, and that is the Smart Trace Buffer dump [1]. The buffer size is
HW specific, but from what I see there is no problem with just adding it
as part of the envp[] initialization below.
The interface to get the buffer is smu_stb_collect_info, and its usage can
be seen in the debugfs interface in smu_stb_debugfs_open.

[1] - https://www.spinics.net/lists/amd-gfx/msg70751.html

Andrey

>
> Cc: Alexandar Deucher <alexander.deuc...@amd.com>
> Cc: Christian Koenig <christian.koe...@amd.com>
> Signed-off-by: Shashank Sharma <shashank.sha...@amd.com>
> ---
>  drivers/gpu/drm/drm_sysfs.c | 24 ++++++++++++++++++++++++
>  include/drm/drm_sysfs.h     |  3 +++
>  2 files changed, 27 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_sysfs.c b/drivers/gpu/drm/drm_sysfs.c
> index 430e00b16eec..52a015161431 100644
> --- a/drivers/gpu/drm/drm_sysfs.c
> +++ b/drivers/gpu/drm/drm_sysfs.c
> @@ -409,6 +409,30 @@ void drm_sysfs_hotplug_event(struct drm_device *dev)
>  }
>  EXPORT_SYMBOL(drm_sysfs_hotplug_event);
>
> +/**
> + * drm_sysfs_reset_event - generate a DRM uevent to indicate GPU reset
> + * @dev: DRM device
> + * @pid: The process ID involve with the reset
> + * @flags: Any other information about the GPU status
> + *
> + * Send a uevent for the DRM device specified by @dev. This indicates
> + * user that a GPU reset has occurred, so that the interested client
> + * can take any recovery or profiling measure, when required.
> + */
> +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t pid, uint32_t flags)
> +{
> +	unsigned char pid_str[21], flags_str[15];
> +	unsigned char reset_str[] = "RESET=1";
> +	char *envp[] = { reset_str, pid_str, flags_str, NULL };
> +
> +	DRM_DEBUG("generating reset event\n");
> +
> +	snprintf(pid_str, ARRAY_SIZE(pid_str), "PID=%lu", pid);
> +	snprintf(flags_str, ARRAY_SIZE(flags_str), "FLAGS=%u", flags);
> +	kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE, envp);
> +}
> +EXPORT_SYMBOL(drm_sysfs_reset_event);
> +
>  /**
>   * drm_sysfs_connector_hotplug_event - generate a DRM uevent for any connector
>   * change
> diff --git a/include/drm/drm_sysfs.h b/include/drm/drm_sysfs.h
> index 6273cac44e47..63f00fe8054c 100644
> --- a/include/drm/drm_sysfs.h
> +++ b/include/drm/drm_sysfs.h
> @@ -2,6 +2,8 @@
>  #ifndef _DRM_SYSFS_H_
>  #define _DRM_SYSFS_H_
>
> +#define DRM_GPU_RESET_FLAG_VRAM_VALID (1 << 0)
> +
>  struct drm_device;
>  struct device;
>  struct drm_connector;
> @@ -11,6 +13,7 @@ int drm_class_device_register(struct device *dev);
>  void drm_class_device_unregister(struct device *dev);
>
>  void drm_sysfs_hotplug_event(struct drm_device *dev);
> +void drm_sysfs_reset_event(struct drm_device *dev, uint64_t pid, uint32_t reset_flags);
>  void drm_sysfs_connector_hotplug_event(struct drm_connector *connector);
>  void drm_sysfs_connector_status_event(struct drm_connector *connector,
>  				      struct drm_property *property);
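
For illustration only, here is a rough userspace sketch of the RESET/PID/FLAGS
environment strings that the quoted drm_sysfs_reset_event() builds, and of how
a listener might parse them back. It is not kernel code and not part of the
patch. The helper names format_reset_env() and parse_reset_env(), the struct
reset_event, and the two *_STR_LEN macros are hypothetical. The buffers are
sized for the worst case: "PID=" plus the 20 digits of UINT64_MAX plus the NUL
needs 25 bytes, and "FLAGS=" plus the 10 digits of UINT32_MAX plus the NUL
needs 17 bytes, so the 21 and 15 bytes in the quoted patch would truncate the
largest values. Since this is userspace, it uses PRIu64 rather than "%lu".

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Mirrors the flag defined in the quoted patch. */
#define DRM_GPU_RESET_FLAG_VRAM_VALID (1u << 0)

/* Hypothetical sizes: sizeof("KEY=") already counts the NUL terminator. */
#define PID_STR_LEN   (sizeof("PID=") + 20)   /* UINT64_MAX has 20 digits */
#define FLAGS_STR_LEN (sizeof("FLAGS=") + 10) /* UINT32_MAX has 10 digits */

/* Hypothetical container for a parsed reset event. */
struct reset_event {
    int reset;       /* 1 if "RESET=1" was present */
    uint64_t pid;    /* value of "PID=<n>" */
    uint32_t flags;  /* value of "FLAGS=<n>" */
};

/* Build the PID= and FLAGS= strings the way the patch does with snprintf. */
static void format_reset_env(uint64_t pid, uint32_t flags,
                             char *pid_str, size_t pid_len,
                             char *flags_str, size_t flags_len)
{
    snprintf(pid_str, pid_len, "PID=%" PRIu64, pid);
    snprintf(flags_str, flags_len, "FLAGS=%" PRIu32, flags);
}

/*
 * Parse a NULL-terminated array of "KEY=VALUE" strings.
 * Returns 0 if a RESET=1 entry was found, -1 otherwise.
 */
static int parse_reset_env(char *const envp[], struct reset_event *ev)
{
    memset(ev, 0, sizeof(*ev));
    for (size_t i = 0; envp[i] != NULL; i++) {
        if (strcmp(envp[i], "RESET=1") == 0)
            ev->reset = 1;
        else if (strncmp(envp[i], "PID=", 4) == 0)
            ev->pid = strtoull(envp[i] + 4, NULL, 10);
        else if (strncmp(envp[i], "FLAGS=", 6) == 0)
            ev->flags = (uint32_t)strtoul(envp[i] + 6, NULL, 10);
    }
    return ev->reset ? 0 : -1;
}
```

A listener would typically test ev.flags & DRM_GPU_RESET_FLAG_VRAM_VALID after
a successful parse. Any extra key, such as one carrying Smart Trace Buffer
data, would need its own sizing, since that buffer size is HW specific.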