Re: [PATCH v2 0/7] Fix multiple GPU resets in XGMI hive.

2022-05-17 Thread Christian König
On 17.05.22 at 21:20, Andrey Grodzovsky wrote: Problem: During a hive reset caused by a command timing out on a ring, extra resets are triggered by KFD, which is unable to access registers on the resetting ASIC. Fix: Rework GPU reset to actively stop any pending reset works while anoth

Re: [PATCH v2 2/7] drm/amdgpu: Switch to delayed work from work_struct.

2022-05-17 Thread Christian König
On 17.05.22 at 21:20, Andrey Grodzovsky wrote: We need to be able to cancel pending reset works without blocking from within GPU reset. Currently the kernel API allows this only for delayed_work and not for work_struct. Switch to delayed work and queue it with delay 0, which is equivalent to queueing work str

Re: [PATCH v2 1/7] drm/amdgpu: Cache result of last reset at reset domain level.

2022-05-17 Thread Christian König
On 17.05.22 at 21:20, Andrey Grodzovsky wrote: Will be read by executors of async reset like debugfs. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1

Re: [Intel-gfx] [V2 3/3] drm/amd/display: Move connector debugfs to drm

2022-05-17 Thread Modem, Bhanuprakash
On Mon-16-05-2022 02:09 pm, Jani Nikula wrote: On Mon, 02 May 2022, Harry Wentland wrote: Both the kernel and IGT series look good to me. I recommend you merge the entire kernel set as one into drm-next. We can pull it into amd-staging-drm-next so as not to break our CI once the IGT patches la

[PATCH v5] drm/amdgpu: Disable ABM when AC mode

2022-05-17 Thread Ryan Lin
Disable the ABM feature when the system is running in AC mode to get better display contrast. v2: remove "UPSTREAM" from the subject. v3: adev->pm.ac_power is updated by amdgpu_acpi_event_handler. v4: Add the file I lost to fix the build error. v5: Move that function of the setting a

[PATCH 2/2] drm/amdkfd: track unified memory reservation with xnack off

2022-05-17 Thread Alex Sierra
[WHY] Unified memory with xnack off should be tracked, as userptr mappings and legacy allocations are, to avoid oversubscribing system memory when xnack is off. [How] Expose the functions reserve_mem_limit and unreserve_mem_limit to the SVM API and call them on every prange creation and free. Signed-off-by: Al

[PATCH 1/2] drm/amdgpu: remove acc_size from reserve/unreserve mem

2022-05-17 Thread Alex Sierra
TTM used to track the "acc_size" of all BOs internally. We needed to keep track of it in our memory reservation to avoid TTM running out of memory in its own accounting. However, that "acc_size" accounting has since been removed from TTM. Therefore we don't really need to track it any more. Signed

[linux-next:master] BUILD REGRESSION 47c1c54d1bcd0a69a56b49473bc20f17b70e5242

2022-05-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: 47c1c54d1bcd0a69a56b49473bc20f17b70e5242 Add linux-next specific files for 20220517 Error/Warning reports: https://lore.kernel.org/linux-mm/202204181931.klac6fwo-...@intel.com https

[PATCH 1/2] drm/amdkfd: port cwsr trap handler from dkms branch

2022-05-17 Thread Eric Huang
Most of the changes are for the debugger feature, and they simplify trap handler support for new ASICs in the future. Signed-off-by: Eric Huang --- .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2527 + .../amd/amdkfd/cwsr_trap_handler_gfx10.asm| 325 ++- .../drm/amd/amdkfd/cws

Re: [PATCH 0/3] Fix issues when unplung monitor under mst scenario

2022-05-17 Thread Lyude Paul
I will try to take a look at this during this week btw On Tue, 2022-05-10 at 17:56 +0800, Wayne Lin wrote: > This patch set is trying to resolve issues observed when unplug monitors > under mst scenario. Revert few commits which cause side effects and seems > no longer needed. And propose a patch

Re: [PATCH 3/3] dmr/amdgpu: add support of tmz for GC 10.3.7

2022-05-17 Thread Alex Deucher
Series is: Reviewed-by: Alex Deucher On Tue, May 17, 2022 at 12:29 PM Sunil Khatri wrote: > > Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set. > > Signed-off-by: Sunil Khatri > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gp

Re: [PATCH 1/2] drm/amdkfd: port cwsr trap handler from dkms branch

2022-05-17 Thread Alex Deucher
On Mon, May 16, 2022 at 3:20 PM Eric Huang wrote: > > It is to simplify trap handler support for new asics in > the future. It would be good to provide a basic overview of what the changes are. Alex > > Signed-off-by: Eric Huang > --- > .../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2527 ++

Re: [PATCH 1/1] drm/amd/pm: fix a potential gpu_metrics_table memory leak

2022-05-17 Thread Alex Deucher
Applied. Thanks! On Tue, May 17, 2022 at 9:13 AM Yuanjun Gong wrote: > > From: Gong Yuanjun > > gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but > not freed in yellow_carp_fini_smc_tables(). > > Signed-off-by: Gong Yuanjun > --- > drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_c

Re: [PATCH v2 7/7] drm/amdgpu: Stop any pending reset if another in progress.

2022-05-17 Thread Felix Kuehling
On 2022-05-17 at 15:21, Andrey Grodzovsky wrote: We skip reset requests if another one is already in progress. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++ 1 file changed, 27 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgp

Re: [PATCH 1/1] radeon: fix a possible null pointer dereference

2022-05-17 Thread Alex Deucher
Applied. Thanks! On Tue, May 17, 2022 at 9:13 AM Yuanjun Gong wrote: > > From: Gong Yuanjun > > In radeon_fp_native_mode(), the return value of drm_mode_duplicate() > is assigned to mode, which will lead to a NULL pointer dereference > on failure of drm_mode_duplicate(). Add a check to avoid np

[PATCH] drm/amd/pm: correct the metrics version for SMU 11.0.11/12/13

2022-05-17 Thread Alex Deucher
From: Evan Quan Correct the metrics version used for SMU 11.0.11/12/13. Fixes misreported GPU metrics (e.g., fan speed) depending on which version of SMU firmware is loaded. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1925 Signed-off-by: Evan Quan Signed-off-by: Alex Deucher ---

Re: [PATCH] drm/amdgpu: fully disable the queues and doorbeels in gfx_v10 before programing the kiq registers

2022-05-17 Thread Alex Deucher
Applied with a reworked commit message. Thanks, Alex On Tue, May 17, 2022 at 7:24 AM wrote: > > From: Haohui Mai > > Signed-off-by: Haohui Mai > --- > drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +- > 1 file changed, 13 insertions(+), 14 deletions(-) > > diff --git a/

[PATCH v2 6/7] drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover

2022-05-17 Thread Andrey Grodzovsky
We removed the wrapper that was using this name to queue the recover function into the reset domain queue. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2

[PATCH v2 7/7] drm/amdgpu: Stop any pending reset if another in progress.

2022-05-17 Thread Andrey Grodzovsky
We skip reset requests if another one is already in progress. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 27 ++ 1 file changed, 27 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu

[PATCH v2 5/7] drm/amdgpu: Add delayed work for GPU reset from kfd.

2022-05-17 Thread Andrey Grodzovsky
We need a delayed work so that this reset can be cancelled if another is already in progress. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 15 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 31 ---

[PATCH v2 4/7] drm/amdgpu: Add delayed work for GPU reset from debugfs

2022-05-17 Thread Andrey Grodzovsky
We need a delayed work so that this reset can be cancelled if another is already in progress. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 19 +-- 2 files changed, 19 insertions(+), 2 deletions(-) diff

[PATCH v2 1/7] drm/amdgpu: Cache result of last reset at reset domain level.

2022-05-17 Thread Andrey Grodzovsky
Will be read by executors of async reset like debugfs. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 + 3 files changed, 6 insertions(+), 2 deletions(-)

[PATCH v2 3/7] drm/admgpu: Serialize RAS recovery work directly into reset domain queue.

2022-05-17 Thread Andrey Grodzovsky
Save the extra useless work schedule. Also switch to delayed work. Signed-off-by: Andrey Grodzovsky --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 12 +++- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 2 +- 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amd

[PATCH v2 2/7] drm/amdgpu: Switch to delayed work from work_struct.

2022-05-17 Thread Andrey Grodzovsky
We need to be able to cancel pending reset works without blocking from within GPU reset. Currently the kernel API allows this only for delayed_work and not for work_struct. Switch to delayed work and queue it with delay 0, which is equivalent to queueing a work_struct. Signed-off-by: Andrey Grodzovsky --- drive

[PATCH v2 0/7] Fix multiple GPU resets in XGMI hive.

2022-05-17 Thread Andrey Grodzovsky
Problem: During a hive reset caused by a command timing out on a ring, extra resets are triggered by KFD, which is unable to access registers on the resetting ASIC. Fix: Rework GPU reset to actively stop any pending reset works while another is in progress. v2: Switch from generic list as
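
The idea in this cover letter (one reset runs, and it stops every other reset that is still pending) can be modelled in plain userspace C. This is only a sketch under assumptions: `reset_domain_sketch`, `queue_reset`, `run_reset`, and `drain_resets` are hypothetical names, and the array stands in for whatever structure the series actually uses. It is not the amdgpu implementation.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_PENDING 8 /* arbitrary capacity for this sketch */

/* Hypothetical stand-ins; none of these names come from amdgpu. */
struct reset_work_sketch {
	bool pending;
	int source; /* e.g. 0 = job timeout, 1 = KFD, 2 = debugfs */
};

struct reset_domain_sketch {
	struct reset_work_sketch works[MAX_PENDING];
	size_t count;
	int resets_run;
};

/* Record a reset request; returns false if the sketch's queue is full. */
static bool queue_reset(struct reset_domain_sketch *d, int source)
{
	if (d->count == MAX_PENDING)
		return false;
	d->works[d->count].pending = true;
	d->works[d->count].source = source;
	d->count++;
	return true;
}

/* Run request idx unless an earlier reset already cancelled it. */
static void run_reset(struct reset_domain_sketch *d, size_t idx)
{
	size_t i;

	if (!d->works[idx].pending)
		return;

	/* Stop every pending request (this one included) before resetting. */
	for (i = 0; i < d->count; i++)
		d->works[i].pending = false;

	d->resets_run++; /* the actual reset would happen here */
}

/* Process the queue in submission order, then empty it. */
static void drain_resets(struct reset_domain_sketch *d)
{
	size_t i;

	for (i = 0; i < d->count; i++)
		run_reset(d, i);
	d->count = 0;
}
```

Queueing three requests and then calling `drain_resets()` leaves `resets_run` at 1 in this model, because the first request cancels the other two before they get a chance to run.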

Re: [PATCH] drm/amdgpu: Set CP_HQD_PQ_CONTROL_RPTR_BLOCK_SIZE correctly in gfx_v8-v11.

2022-05-17 Thread Alex Deucher
On Tue, May 17, 2022 at 2:06 AM wrote: > > From: Haohui Mai > > Remove the accidental shifts on the values of RPTR_BLOCK_SIZE in gfx_v8-v11. > The bug essentially always programs the corresponding fields to zero > instead of the correct value. The hardware clamps values below 5 to 5. Updated th
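
One way an accidental shift can zero a register field is when the caller pre-shifts a value that the field helper shifts again. The sketch below illustrates that mechanism only; the shift, mask, and helper names are assumptions made for the example and are not the real CP_HQD_PQ_CONTROL layout or the amdgpu macros.

```c
#include <stdint.h>

/* Hypothetical field layout, chosen only for illustration. */
#define RPTR_BLOCK_SIZE_SHIFT 8u
#define RPTR_BLOCK_SIZE_MASK  (0x3fu << RPTR_BLOCK_SIZE_SHIFT)

/* Insert val into the field: shift into position, then mask. */
static uint32_t set_field(uint32_t reg, uint32_t val)
{
	return (reg & ~RPTR_BLOCK_SIZE_MASK) |
	       ((val << RPTR_BLOCK_SIZE_SHIFT) & RPTR_BLOCK_SIZE_MASK);
}

/* Read the field back as a plain number. */
static uint32_t get_field(uint32_t reg)
{
	return (reg & RPTR_BLOCK_SIZE_MASK) >> RPTR_BLOCK_SIZE_SHIFT;
}

/* Buggy caller: the value is already shifted, so set_field() shifts it
 * a second time and the mask discards every bit. */
static uint32_t program_buggy(uint32_t reg)
{
	return set_field(reg, 5u << RPTR_BLOCK_SIZE_SHIFT);
}

/* Fixed caller: pass the plain value and let the helper place it. */
static uint32_t program_fixed(uint32_t reg)
{
	return set_field(reg, 5u);
}
```

With this layout, `get_field(program_buggy(0))` yields 0 while `get_field(program_fixed(0))` yields 5, which matches the symptom described in the patch: the field is always programmed to zero instead of the intended value.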

Re: [PATCH 1/2] drm/amdgpu: Convert to common fdinfo format v5

2022-05-17 Thread Sharma, Shashank
Please feel free to use: Reviewed-by: Shashank Sharma On 5/17/2022 12:36 PM, Christian König wrote: Convert fdinfo format to the one documented in drm-usage-stats.rst. It turned out that the existing implementation was actually complete nonsense. The calculated percentages indeed represented the

Re: [PATCH v2] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Alex Deucher
On Tue, May 17, 2022 at 1:00 PM Mario Limonciello wrote: > > An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC > reset for handling aborted suspend can't work with s2idle. > > This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu: > always reset the asic in suspe

[PATCH v2] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Mario Limonciello
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC reset for handling aborted suspend can't work with s2idle. This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)"). A few other commits have gone on top of the ASIC reset, b

RE: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Limonciello, Mario
[Public] No, it's mode2 reset that it uses for the failure case. From: Lazar, Lijo Sent: Tuesday, May 17, 2022 11:51 To: Limonciello, Mario ; amd-gfx@lists.freedesktop.org Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle [Public] Ya, second is too lengthy. Better t

Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Lazar, Lijo
[Public] Ya, second is too lengthy. Better to leave it as it is. BTW, is this specific to reset by BACO? BACO entry/exit may take longer (better chance of suspend entry abort by some wake-up source). Thanks, Lijo From: Limonciello, Mario Sent: Tuesday, May 17,

Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Alex Deucher
On Tue, May 17, 2022 at 12:30 PM Limonciello, Mario wrote: > > [Public] > > > > > PM_SUSPEND_TO_IDLE should be under a compile guard > > > > It is actually. All of the amdgpu_acpi_* are. It’s not obvious though > looking at the patch, you need to apply it to notice it. > > > > > It makes sense

RE: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Limonciello, Mario
[Public] > PM_SUSPEND_TO_IDLE should be under a compile guard It is actually. All of the amdgpu_acpi_* are. It's not obvious though looking at the patch, you need to apply it to notice it. > It makes sense to rename to something like amdgpu_need_reset_on_suspend() as > it decides on reset on

[PATCH 3/3] dmr/amdgpu: add support of tmz for GC 10.3.7

2022-05-17 Thread Sunil Khatri
Add support of IP GC 10.3.7 in amdgpu_gmc_tmz_set. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c index 7e55ee61f84c..798c56214a23 100

[PATCH 2/3] drm/amdgpu: change code name to ip version for tmz set

2022-05-17 Thread Sunil Khatri
Use IP version rather than code name of IPs for tmz set. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 27 - 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/am

[PATCH 1/3] drm/amdgpu: move amdgpu_gmc_tmz_set after ip_version populated

2022-05-17 Thread Sunil Khatri
Enabling the TMZ feature based on IP version requires adev->ip_version to be populated, but it is empty at that point. Move amdgpu_gmc_tmz_set to a place where ip_version is populated. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff

Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Lazar, Lijo
[Public] A couple of things - PM_SUSPEND_TO_IDLE should be under a compile guard It makes sense to rename to something like amdgpu_need_reset_on_suspend() as it decides on reset only for a suspend situation. Thanks, Lijo

[PATCH 14/14] drm/radeon: Register ACPI video backlight when skipping radeon backlight registration

2022-05-17 Thread Hans de Goede
Typically the acpi_video driver will initialize before radeon, which used to cause /sys/class/backlight/acpi_video0 to get registered and then radeon would register its own radeon_bl# device later. After which the drivers/acpi/video_detect.c code unregistered the acpi_video0 device to avoid there b

[PATCH 13/14] drm/amdgpu: Register ACPI video backlight when skipping amdgpu backlight registration

2022-05-17 Thread Hans de Goede
Typically the acpi_video driver will initialize before amdgpu, which used to cause /sys/class/backlight/acpi_video0 to get registered and then amdgpu would register its own amdgpu_bl# device later. After which the drivers/acpi/video_detect.c code unregistered the acpi_video0 device to avoid there b

[PATCH 12/14] drm/nouveau: Register ACPI video backlight when nv_backlight registration fails

2022-05-17 Thread Hans de Goede
Typically the acpi_video driver will initialize before nouveau, which used to cause /sys/class/backlight/acpi_video0 to get registered and then nouveau would register its own nv_backlight device later. After which the drivers/acpi/video_detect.c code unregistered the acpi_video0 device to avoid the

[PATCH 11/14] drm/i915: Call acpi_video_register_backlight()

2022-05-17 Thread Hans de Goede
On machines without an i915 opregion the acpi_video driver immediately probes the ACPI video bus and used to also immediately register acpi_video# backlight devices when supported. Once the drm/kms driver then loaded later and possibly registered a native backlight device then the drivers/acpi/vide

[PATCH 08/14] ACPI: video: Simplify acpi_video_unregister_backlight()

2022-05-17 Thread Hans de Goede
When acpi_video_register() has not run yet the video_bus_head will be empty, so there is no need to check the register_count flag first. Signed-off-by: Hans de Goede --- drivers/acpi/acpi_video.c | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/drivers/acpi/acpi_v

[PATCH 10/14] ACPI: video: Remove code to unregister acpi_video backlight when a native backlight registers

2022-05-17 Thread Hans de Goede
Remove the code to unregister acpi_video backlight devices when a native backlight device gets registered later. Now that the acpi_video backlight device registration is a separate step which runs later, after the drm/kms driver is done setting up its own native backlight device, it is no longer n

[PATCH 09/14] ACPI: video: Make backlight class device registration a separate step

2022-05-17 Thread Hans de Goede
On x86/ACPI boards the acpi_video driver will usually initialize before the kms driver (except i915). This causes /sys/class/backlight/acpi_video0 to show up and then the kms driver registers its own native backlight device after which the drivers/acpi/video_detect.c code unregisters the acpi_vid

[PATCH 06/14] ACPI: video: Drop backlight_device_get_by_type() call from acpi_video_get_backlight_type()

2022-05-17 Thread Hans de Goede
Now that all kms drivers which register native/BACKLIGHT_RAW type backlight devices on x86/ACPI boards call acpi_video_get_backlight_type(true), with the native=true value getting cached, there no longer is a need to call backlight_device_get_by_type(BACKLIGHT_RAW) to see if a native backlight devi

[PATCH 07/14] ACPI: video: Remove acpi_video_bus from list before tearing it down

2022-05-17 Thread Hans de Goede
Move the list_del removing an acpi_video_bus from video_bus_head on teardown to before the teardown is done, to avoid code iterating over the video_bus_head list seeing acpi_video_bus objects on there which are (partly) torn down already. Signed-off-by: Hans de Goede --- drivers/acpi/acpi_video.

[PATCH 04/14] drm/radeon: Don't register backlight when another backlight should be used

2022-05-17 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight device we register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones. Registering 2 backlight

[PATCH 00/14] drm/kms: Stop registering multiple /sys/class/backlight devs for a single display

2022-05-17 Thread Hans de Goede
Hi All, As mentioned in my RFC titled "drm/kms: control display brightness through drm_connector properties": https://lore.kernel.org/dri-devel/0d188965-d809-81b5-74ce-7d30c49fe...@redhat.com/ The first step towards this is to deal with some existing technical debt in backlight handling on x86/AC

[PATCH 03/14] drm/amdgpu: Don't register backlight when another backlight should be used

2022-05-17 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight device we register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones. Registering 2 backlight

[PATCH 05/14] drm/nouveau: Don't register backlight when another backlight should be used

2022-05-17 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight device we register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones. Registering 2 backlight

[PATCH 02/14] drm/i915: Don't register backlight when another backlight should be used

2022-05-17 Thread Hans de Goede
Before this commit when we want userspace to use the acpi_video backlight device we register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones. Registering 2 backlight

[PATCH 01/14] ACPI: video: Add a native function parameter to acpi_video_get_backlight_type()

2022-05-17 Thread Hans de Goede
ATM on x86 laptops where we want userspace to use the acpi_video backlight device we often register both the GPU's native backlight device and acpi_video's firmware acpi_video# backlight device. This relies on userspace preferring firmware type backlight devices over native ones, but registering 2

Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Alex Deucher
On Tue, May 17, 2022 at 10:06 AM Limonciello, Mario wrote: > > [Public] > > > > > -Original Message- > > From: Alex Deucher > > Sent: Tuesday, May 17, 2022 08:43 > > To: Limonciello, Mario > > Cc: amd-gfx list > > Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to

RE: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Limonciello, Mario
[Public] > -Original Message- > From: Alex Deucher > Sent: Tuesday, May 17, 2022 08:43 > To: Limonciello, Mario > Cc: amd-gfx list > Subject: Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to > s2idle > > On Tue, May 17, 2022 at 9:39 AM Mario Limonciello > wrote: > >

Re: [PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Alex Deucher
On Tue, May 17, 2022 at 9:39 AM Mario Limonciello wrote: > > An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC > reset for handling aborted suspend can't work with s2idle. > > This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu: > always reset the asic in suspe

[PATCH] drm/amd: Don't reset dGPUs if the system is going to s2idle

2022-05-17 Thread Mario Limonciello
An A+A configuration on ASUS ROG Strix G513QY proves that the ASIC reset for handling aborted suspend can't work with s2idle. This functionality was introduced in commit daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)"). A few other commits have gone on top of the ASIC reset, b

[PATCH 1/1] radeon: fix a possible null pointer dereference

2022-05-17 Thread Yuanjun Gong
From: Gong Yuanjun In radeon_fp_native_mode(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd. The failure status of drm_cvt_mode() on the other path is checked too. Signed-
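
The pattern behind this fix, checking the result of a duplicating allocator before the first dereference, can be sketched in userspace C. All names below (`mode_sketch`, `mode_duplicate`, `native_mode`, `simulate_alloc_failure`) are hypothetical stand-ins, not the radeon or DRM functions, and the failure flag exists only so the error path can be exercised.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a display mode; not the DRM structure. */
struct mode_sketch {
	int hdisplay;
	int vdisplay;
	bool preferred;
};

/* Hook for this sketch only: duplication fails while this is set. */
static bool simulate_alloc_failure;

/* Return a heap copy of src, or NULL if allocation fails. */
static struct mode_sketch *mode_duplicate(const struct mode_sketch *src)
{
	struct mode_sketch *copy;

	if (simulate_alloc_failure)
		return NULL;
	copy = malloc(sizeof(*copy));
	if (copy)
		*copy = *src;
	return copy;
}

/* Duplicate the panel mode and mark it preferred; NULL on failure. */
static struct mode_sketch *native_mode(const struct mode_sketch *panel)
{
	struct mode_sketch *mode = mode_duplicate(panel);

	if (!mode) /* check before the first dereference */
		return NULL;
	mode->preferred = true;
	return mode;
}
```

Without the `if (!mode)` check, the assignment to `mode->preferred` would dereference NULL whenever duplication fails. The caller owns the returned copy and is expected to `free()` it.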

Re: [PATCH 1/1] drm/amd/pm: fix a potential gpu_metrics_table memory leak

2022-05-17 Thread Greg KH
On Tue, May 17, 2022 at 05:57:46PM +0800, Yuanjun Gong wrote: > From: Gong Yuanjun > > gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but > not freed in yellow_carp_fini_smc_tables(). > > Signed-off-by: Gong Yuanjun > --- > drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c

[PATCH 1/1] drm/amd/pm: fix a potential gpu_metrics_table memory leak

2022-05-17 Thread Yuanjun Gong
From: Gong Yuanjun gpu_metrics_table is allocated in yellow_carp_init_smc_tables() but not freed in yellow_carp_fini_smc_tables(). Signed-off-by: Gong Yuanjun --- drivers/gpu/drm/amd/pm/swsmu/smu13/yellow_carp_ppt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/
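
The leak described here is an init/fini pairing problem: every buffer allocated in the init path needs a matching free in the fini path. A minimal userspace sketch, assuming hypothetical names (`smu_tables_sketch`, `tables_init`, `tables_fini`) and arbitrary buffer sizes, none of which come from the yellow_carp code:

```c
#include <stdlib.h>

/* Hypothetical table container; field names are illustrative only. */
struct smu_tables_sketch {
	void *metrics_table;
	void *gpu_metrics_table;
};

/* Allocate both tables; on failure, undo what was already allocated. */
static int tables_init(struct smu_tables_sketch *t)
{
	t->metrics_table = calloc(1, 64);
	if (!t->metrics_table)
		return -1;

	t->gpu_metrics_table = calloc(1, 128);
	if (!t->gpu_metrics_table) {
		free(t->metrics_table);
		t->metrics_table = NULL;
		return -1;
	}
	return 0;
}

/* Free everything tables_init() allocated, in reverse order. */
static void tables_fini(struct smu_tables_sketch *t)
{
	free(t->gpu_metrics_table); /* the free that a leak would omit */
	t->gpu_metrics_table = NULL;
	free(t->metrics_table);
	t->metrics_table = NULL;
}
```

Resetting each pointer to NULL after freeing makes a repeated `tables_fini()` call harmless, since `free(NULL)` is a no-op.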

Re: [PATCH 1/1] radeon: fix a possible null pointer dereference

2022-05-17 Thread Greg KH
On Tue, May 17, 2022 at 05:57:00PM +0800, Yuanjun Gong wrote: > From: Gong Yuanjun > > In radeon_fp_native_mode(), the return value of drm_mode_duplicate() > is assigned to mode, which will lead to a NULL pointer dereference > on failure of drm_mode_duplicate(). Add a check to avoid npd. > > The

[PATCH] drm/amdgpu: fully disable the queues and doorbeels in gfx_v10 before programing the kiq registers

2022-05-17 Thread ricetons
From: Haohui Mai Signed-off-by: Haohui Mai --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 27 +- 1 file changed, 13 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c index dd8f4344eeb8..9a1b42cc850

[PATCH 2/2] drm/amdgpu: add drm-client-id to fdinfo v2

2022-05-17 Thread Christian König
This is enough to get gputop working :) v2: rebase and some addition cleanup Signed-off-by: Christian König Reviewed-by: Shashank Sharma (v1) --- drivers/gpu/drm/amd/amdgpu/amdgpu_fdinfo.c | 15 +++ 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/drm/amd/a

[PATCH 1/2] drm/amdgpu: Convert to common fdinfo format v5

2022-05-17 Thread Christian König
Convert fdinfo format to the one documented in drm-usage-stats.rst. It turned out that the existing implementation was actually complete nonsense. The calculated percentages indeed represented the usage of the engine, but with varying time slices. So 10% usage for application A could mean something
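
To see why cumulative counters avoid the varying-time-slice problem: as I understand drm-usage-stats.rst, engine time is exposed as a monotonically increasing nanosecond value (keys of the form `drm-engine-<name>: <value> ns`), and the reader picks the sampling interval. The helper below is a hypothetical reader-side sketch, not part of the patch; `engine_utilization` and its parameters are assumed names.

```c
#include <stdint.h>

/*
 * Utilization of one engine between two samples, as a fraction in [0, 1]
 * when the counters are consistent. busy_* are cumulative busy
 * nanoseconds read from fdinfo; t_* are the wall-clock timestamps (ns)
 * at which each sample was taken.
 */
static double engine_utilization(uint64_t busy_ns_prev, uint64_t busy_ns_now,
				 uint64_t t_ns_prev, uint64_t t_ns_now)
{
	/* Reject empty intervals and counters that went backwards. */
	if (t_ns_now <= t_ns_prev || busy_ns_now < busy_ns_prev)
		return 0.0;

	return (double)(busy_ns_now - busy_ns_prev) /
	       (double)(t_ns_now - t_ns_prev);
}
```

Because every client is measured over the same reader-chosen interval, the resulting fractions are directly comparable across applications.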