Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-05 Thread Christian König
On 04.01.22 at 19:08, Felix Kuehling wrote: [+Adrian] On 2021-12-23 at 2:05 a.m., Christian König wrote: On 22.12.21 at 21:53, Daniel Vetter wrote: On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote: [SNIP] Still sounds funky. I think minimally we should have an ack from C

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Lazar, Lijo
On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency there. For TDR call the original recovery function directly since it's already executed from

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Christian König
On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency there. For TDR call the original recovery function d

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Lazar, Lijo
On 1/5/2022 6:01 PM, Christian König wrote: On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency ther

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Christian König
On 05.01.22 at 14:11, Lazar, Lijo wrote: On 1/5/2022 6:01 PM, Christian König wrote: On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU re

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Lazar, Lijo
On 1/5/2022 6:45 PM, Christian König wrote: On 05.01.22 at 14:11, Lazar, Lijo wrote: On 1/5/2022 6:01 PM, Christian König wrote: On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysf

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Christian König
On 05.01.22 at 14:26, Lazar, Lijo wrote: On 1/5/2022 6:45 PM, Christian König wrote: On 05.01.22 at 14:11, Lazar, Lijo wrote: On 1/5/2022 6:01 PM, Christian König wrote: On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for n

Re: [Patch v4 23/24] drm/amdkfd: CRIU prepare for svm resume

2022-01-05 Thread philip yang
On 2021-12-22 7:37 p.m., Rajneesh Bhardwaj wrote: During CRIU restore phase, the VMAs for the virtual address ranges are not at their final location yet so in this stage, only cache the data required to successfully resume the svm ranges during an imminent CR

Re: [Patch v4 21/24] drm/amdkfd: CRIU Discover svm ranges

2022-01-05 Thread philip yang
On 2021-12-22 7:37 p.m., Rajneesh Bhardwaj wrote: A KFD process may contain a number of virtual address ranges for shared virtual memory management and each such range can have many SVM attributes spanning across various nodes within the process boundary. Thi

Re: [Patch v4 18/24] drm/amdkfd: CRIU checkpoint and restore xnack mode

2022-01-05 Thread philip yang
On 2021-12-22 7:37 p.m., Rajneesh Bhardwaj wrote: Recoverable page faults are represented by the xnack mode setting inside a kfd process and are used to represent the device page faults. For CR, we don't consider negative values which are typically used for q

Re: [PATCH] drm/amdkfd: Check for null pointer after calling kmemdup

2022-01-05 Thread Felix Kuehling
On 2022-01-05 at 4:09 a.m., Jiasheng Jiang wrote: > As the possible failure of the allocation, kmemdup() may return NULL > pointer. > Therefore, it should be better to check the 'props2' in order to prevent > the dereference of NULL pointer. > > Fixes: 3a87177eb141 ("drm/amdkfd: Add topology suppo

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-05 Thread Felix Kuehling
On 2022-01-05 at 3:08 a.m., Christian König wrote: > On 04.01.22 at 19:08, Felix Kuehling wrote: >> [+Adrian] >> >> On 2021-12-23 at 2:05 a.m., Christian König wrote: >> >>> On 22.12.21 at 21:53, Daniel Vetter wrote: On Mon, Dec 20, 2021 at 01:12:51PM -0500, Bhardwaj, Rajneesh wrote:

Re: [PATCH] drm/ttm: Don't inherit GEM object VMAs in child process

2022-01-05 Thread Felix Kuehling
On 2022-01-05 at 11:16 a.m., Felix Kuehling wrote: >> I was already wondering which mmaps through the KFD node we have left >> which cause problems here. > We still use the KFD FD for mapping doorbells and HDP flushing. These > are both SG BOs, so they cannot be CPU-mapped through render nodes. Th

[PATCH v2] drm/amd/display: explicitly update clocks when DC is set to D3 in s0i3

2022-01-05 Thread Mario Limonciello
The WA from commit 5965280abd30 ("drm/amd/display: Apply w/a for hard hang on HPD") causes a regression in s0ix where the system will fail to resume properly. This may be because an HPD was active the last time clocks were updated but clocks didn't get updated again during s0ix. So add an extra c

Re: [RFC v2 4/8] drm/amdgpu: Serialize non TDR gpu recovery with TDRs

2022-01-05 Thread Andrey Grodzovsky
On 2022-01-05 7:31 a.m., Christian König wrote: On 05.01.22 at 10:54, Lazar, Lijo wrote: On 12/23/2021 3:35 AM, Andrey Grodzovsky wrote: Use reset domain wq also for non TDR gpu recovery triggers such as sysfs and RAS. We must serialize all possible GPU recoveries to guarantee no concurrency th

Re: [PATCH v2] drm/amdgpu: Unmap MMIO mappings when device is not unplugged

2022-01-05 Thread Andrey Grodzovsky
On 2022-01-04 11:23 p.m., Leslie Shi wrote: Patch: 3efb17ae7e92 ("drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure") makes call to amdgpu_device_unmap_mmio() conditioned on device unplugged. This patch unmaps MMIO mappings even wh

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread Andrey Grodzovsky
On 2022-01-05 2:59 a.m., Christian König wrote: On 05.01.22 at 08:34, JingWen Chen wrote: On 2022/1/5 12:56 AM, Andrey Grodzovsky wrote: On 2022-01-04 6:36 a.m., Christian König wrote: On 04.01.22 at 11:49, Liu, Monk wrote: [AMD Official Use Only] See the FLR request from the hypervisor i

Re: [PATCH] drm/amdkfd: make SPDX License expression more sound

2022-01-05 Thread Felix Kuehling
On 2021-12-16 at 4:45 a.m., Lukas Bulwahn wrote: > Commit b5f57384805a ("drm/amdkfd: Add sysfs bitfields and enums to uAPI") > adds include/uapi/linux/kfd_sysfs.h with the "GPL-2.0 OR MIT WITH > Linux-syscall-note" SPDX-License expression. > > The command ./scripts/spdxcheck.py warns: > > includ

[PATCH] drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2

2022-01-05 Thread Harry Wentland
For some reason this file isn't using the appropriate register headers for DCN headers, which means that on DCN2 we're getting the VIEWPORT_DIMENSION offset wrong. This means that we're not correctly carving out the framebuffer memory for a framebuffer allocated by EFI and therefore see

Re: [PATCH v2] drm/amd/display: explicitly update clocks when DC is set to D3 in s0i3

2022-01-05 Thread Harry Wentland
On 2022-01-05 12:06, Mario Limonciello wrote: > The WA from commit 5965280abd30 ("drm/amd/display: Apply w/a for > hard hang on HPD") causes a regression in s0ix where the system will > fail to resume properly. This may be because an HPD was active the last > time clocks were updated but clocks di

RE: [PATCH v2] drm/amd/display: explicitly update clocks when DC is set to D3 in s0i3

2022-01-05 Thread Limonciello, Mario
[Public] > -----Original Message----- > From: Wentland, Harry > Sent: Wednesday, January 5, 2022 15:26 > To: Limonciello, Mario ; amd- > g...@lists.freedesktop.org > Cc: Zhuo, Qingqing (Lillian) ; Scott Bruce > ; Chris Hixon ; > spassw...@web.de > Subject: Re: [PATCH v2] drm/amd/display: explic

[PATCH v3 1/2] drm/amd/display: Add power_state member into dc_state

2022-01-05 Thread Mario Limonciello
This can be used by the display core to let decisions be made based upon the requested power state. Cc: Qingqing Zhuo Cc: Scott Bruce Cc: Chris Hixon Cc: spassw...@web.de Signed-off-by: Mario Limonciello --- changes from v2->v3: * New patch drivers/gpu/drm/amd/display/dc/core/dc.c| 2

[PATCH v3 2/2] drm/amd/display: Use requested power state to avoid HPD WA during s0ix

2022-01-05 Thread Mario Limonciello
The WA from commit 5965280abd30 ("drm/amd/display: Apply w/a for hard hang on HPD") causes a regression in s0ix where the system will fail to resume properly. This may be because an HPD was active the last time clocks were updated but clocks didn't get updated again during s0ix. So add an extra c

Re: [RFC PATCH 0/3] Add support modifiers for drivers whose planes only support linear layout

2022-01-05 Thread Simon Ser
Thanks for working on this! I've pushed a patch [1] to drm-misc-next which touches the same function, can you rebase your patches on top of it? [1]: https://patchwork.freedesktop.org/patch/467940/?series=98255&rev=3

Re: [PATCH] drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2

2022-01-05 Thread Huang Rui
On Thu, Jan 06, 2022 at 04:39:01AM +0800, Harry Wentland wrote: > For some reason this file isn't using the appropriate register > headers for DCN headers, which means that on DCN2 we're getting > the VIEWPORT_DIMENSION offset wrong. > > This means that we're not correctly carving out the framebuf

RE: [PATCH v2] drm/amdgpu: Unmap MMIO mappings when device is not unplugged

2022-01-05 Thread Shi, Leslie
[AMD Official Use Only] Hi Andrey, It is the following patch calls to amdgpu_device_unmap_mmio() conditioned on device unplugged. 3efb17ae7e92 "drm/amdgpu: Call amdgpu_device_unmap_mmio() if device is unplugged to prevent crash in GPU initialization failure" Regards, Leslie -Original Mes

[PATCH] drm/amdgpu: Add interface to load SRIOV cap FW

2022-01-05 Thread Bokun Zhang
- Add interface to load SRIOV cap FW. If the FW does not exist, simply skip this FW loading routine. This FW will only be loaded under SRIOV. Other driver setup will not be affected. Adding this interface makes it easier for us to prepare the SRIOV Linux guest driver for different users.

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
On 2022/1/6 2:24 AM, Andrey Grodzovsky wrote: > > On 2022-01-05 2:59 a.m., Christian König wrote: >> On 05.01.22 at 08:34, JingWen Chen wrote: >>> On 2022/1/5 12:56 AM, Andrey Grodzovsky wrote: On 2022-01-04 6:36 a.m., Christian König wrote: > On 04.01.22 at 11:49, Liu, Monk wrote: >

Re: [RFC v2 8/8] drm/amd/virt: Drop concurrent GPU reset protection for SRIOV

2022-01-05 Thread JingWen Chen
On 2022/1/6 12:59 PM, JingWen Chen wrote: > On 2022/1/6 2:24 AM, Andrey Grodzovsky wrote: >> On 2022-01-05 2:59 a.m., Christian König wrote: >>> On 05.01.22 at 08:34, JingWen Chen wrote: On 2022/1/5 12:56 AM, Andrey Grodzovsky wrote: > On 2022-01-04 6:36 a.m., Christian König wrote: >>

[PATCH 0/7] Drop unnecessary power related lock protections

2022-01-05 Thread Evan Quan
A unified lock protection mechanism was already enforced on those APIs from amdgpu_dpm.c. Thus those extra internal lock protections will be unnecessary and can be dropped. Evan Quan (7): drm/amd/pm: drop unneeded lock protection smu->mutex drm/amd/pm: drop unneeded vcn/jpeg_gate_lock drm/am

[PATCH 2/7] drm/amd/pm: drop unneeded vcn/jpeg_gate_lock

2022-01-05 Thread Evan Quan
As those related APIs are already protected by adev->pm.mutex. Signed-off-by: Evan Quan Change-Id: I762fab96bb1c034c153b029f939ec6e498460007 --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 56 +++ drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 2 - 2 files changed, 8 insert

[PATCH 1/7] drm/amd/pm: drop unneeded lock protection smu->mutex

2022-01-05 Thread Evan Quan
As all those APIs are already protected either by adev->pm.mutex or smu->message_lock. Signed-off-by: Evan Quan Change-Id: I1db751fba9caabc5ca1314992961d3674212f9b0 --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 315 ++ drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 -

[PATCH 4/7] drm/amd/pm: drop unneeded smu->sensor_lock

2022-01-05 Thread Evan Quan
As all those related APIs are already well protected by adev->pm.mutex and smu->message_lock. Signed-off-by: Evan Quan Change-Id: I20974b2ae68d63525bc7c7f406fede2971c5fecc --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h |

[PATCH 3/7] drm/amd/pm: drop unneeded smu->metrics_lock

2022-01-05 Thread Evan Quan
As all those related APIs are already well protected by adev->pm.mutex and smu->message_lock. Signed-off-by: Evan Quan Change-Id: Ic75326ba7b4b67be8762d5407d02f6c514e1ad35 --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 - .../gpu/dr

[PATCH 6/7] drm/amd/pm: drop unneeded feature->mutex

2022-01-05 Thread Evan Quan
As all those related APIs are already well protected by adev->pm.mutex. Signed-off-by: Evan Quan Change-Id: Ia2c752ff22e8f23601484f48b66151cfda8c01b5 --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 - .../gpu/drm/amd/pm/swsmu/smu13/smu

[PATCH 5/7] drm/amd/pm: drop unneeded smu_baco->mutex

2022-01-05 Thread Evan Quan
As those related APIs are already well protected by adev->pm.mutex. Signed-off-by: Evan Quan Change-Id: I8a7d8da5710698a98dd0f7e70c244be57474b573 --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 1 - drivers/gpu/drm/amd/pm/swsmu/inc/amdgpu_smu.h | 1 - .../gpu/drm/amd/pm/swsmu/smu11/smu_v11

[PATCH 7/7] drm/amd/pm: drop unneeded hwmgr->smu_lock

2022-01-05 Thread Evan Quan
As all those related APIs are already well protected by adev->pm.mutex. Signed-off-by: Evan Quan Change-Id: I36426791d3bbc9d84a6ae437da26a892682eb0cb --- .../gpu/drm/amd/pm/powerplay/amd_powerplay.c | 282 +++--- drivers/gpu/drm/amd/pm/powerplay/inc/hwmgr.h | 1 - 2 files changed

[PATCH] drm/amdgpu: Enable second VCN for certain Navi2x.

2022-01-05 Thread Peng Ju Zhou
Certain navi2x cards have 2 VCNs, enable it. Signed-off-by: Peng Ju Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c index 58

Re: [PATCH 3/7] drm/amd/pm: drop unneeded smu->metrics_lock

2022-01-05 Thread Lazar, Lijo
On 1/6/2022 11:27 AM, Evan Quan wrote: As all those related APIs are already well protected by adev->pm.mutex and smu->message_lock. This one may be widely used. Instead of relying on pm.mutex it's better to keep metrics lock so that multiple clients can read data without waiting on other

RE: [PATCH 3/7] drm/amd/pm: drop unneeded smu->metrics_lock

2022-01-05 Thread Quan, Evan
[AMD Official Use Only] > -----Original Message----- > From: Lazar, Lijo > Sent: Thursday, January 6, 2022 2:17 PM > To: Quan, Evan ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH 3/7] drm/amd/pm: drop unneeded smu->metrics_lock > > > > On 1/6/2022 11:27 AM, E

[PATCH v2] drm/amdgpu: Enable second VCN for certain Navy Flounder.

2022-01-05 Thread Peng Ju Zhou
Certain Navy Flounder cards have 2 VCNs, enable it. Signed-off-by: Peng Ju Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c i