[PATCH] drm/amdgpu: access ltr through pci cfg space

2024-06-19 Thread Min, Frank
[AMD Official Use Only - AMD Internal Distribution Only] From: Frank Min Access ltr through pci cfg space instead of mmio while programming aspm on gfx12 Signed-off-by: Frank Min --- drivers/gpu/drm/amd/amdgpu/nbif_v6_3_1.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-)

[PATCH] drm/amdgpu: process RAS fatal error MB notification

2024-06-19 Thread Vignesh Chander
For RAS error scenario, VF guest driver will check mailbox and set fed flag to avoid unnecessary HW accesses. Additionally, poll for reset completion message first to avoid accidentally spamming multiple reset requests to host. v2: add another mailbox check for handling case where kfd detects time

[PATCH V2 4/4] drm/amdgpu: add gpu reset check and exception handling

2024-06-19 Thread YiPeng Chai
Add gpu reset check and exception handling for page retirement. v2: Clear poison consumption messages cached in fifo after non mode-1 reset. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 52 + 1 file changed, 52 insertions(+) diff --git a/dr

[PATCH V2 2/4] drm/amdgpu: refine poison creation interrupt handler

2024-06-19 Thread YiPeng Chai
To handle the case where a large number of RAS poison interrupts arrive: 1. Change to use variable to record poison creation requests to avoid fifo full. 2. Prioritize handling poison creation requests instead of following the order of requests received by the driver. Signed-off-by: Y
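The two points above can be sketched roughly as follows. This is a minimal illustration only, not the code from the patch: `struct poison_state`, `note_creation`, `push_consumption`, `next_work`, and the FIFO capacity are all hypothetical names and values chosen for the example. Creation requests are counted in a plain variable so they cannot fill the FIFO, and the dispatcher serves pending creation requests before anything queued in the FIFO.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIFO_CAP 4 /* hypothetical capacity, for illustration only */

enum work_kind { WORK_NONE, WORK_CREATION, WORK_CONSUMPTION };

/* Hypothetical state: a counter for creation requests, a ring for the rest. */
struct poison_state {
    unsigned int creation_pending; /* counted, so it can never "fill up" */
    int fifo[FIFO_CAP];            /* queued consumption messages */
    size_t head, len;
};

/* Record a poison creation request without touching the FIFO. */
static void note_creation(struct poison_state *p)
{
    p->creation_pending++;
}

/* Queue a consumption message; returns false if the FIFO is full. */
static bool push_consumption(struct poison_state *p, int msg)
{
    if (p->len == FIFO_CAP)
        return false;
    p->fifo[(p->head + p->len) % FIFO_CAP] = msg;
    p->len++;
    return true;
}

/* Pick the next item: creation requests first, then FIFO order. */
static enum work_kind next_work(struct poison_state *p, int *msg)
{
    if (p->creation_pending) {
        p->creation_pending--;
        return WORK_CREATION;
    }
    if (p->len) {
        *msg = p->fifo[p->head];
        p->head = (p->head + 1) % FIFO_CAP;
        p->len--;
        return WORK_CONSUMPTION;
    }
    return WORK_NONE;
}
```

With this layout, a burst of creation interrupts only increments a counter, so the FIFO capacity is consumed solely by the messages that actually need to be queued.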

[PATCH V2 3/4] drm/amdgpu: refine poison consumption interrupt handler

2024-06-19 Thread YiPeng Chai
1. The poison fifo is only used for poison consumption requests. 2. Merge reset requests when poison fifo caches multiple poison consumption messages Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 56 - drivers/gpu/drm/amd/amdgpu/amdgpu_umc

[PATCH V2 1/4] drm/amdgpu: add variable to record the deferred error number read by driver

2024-06-19 Thread YiPeng Chai
Add variable to record the deferred error number read by driver. Signed-off-by: YiPeng Chai --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 62 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h | 3 +- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 4 +- 3 files changed, 48 insertions

[PATCH 2/2] Revert "drm/amd/amdgpu: add module parameter for jpeg"

2024-06-19 Thread Kenneth Feng
This reverts commit 63400bcf5cb23b6a9b674eb3f2d733d826860065. Revert this due to a final solution in amdgpu vcn: commit eef47ed5f703377781ce89eae4b9140325049873 Author: Sonny Jiang Date: Tue Jun 18 11:11:11 2024 -0400 drm/amdgpu/jpeg5: reprogram doorbell setting after power up for each playback

[PATCH 1/2] Revert "drm/amd/pm: workaround to pass jpeg unit test"

2024-06-19 Thread Kenneth Feng
This reverts commit a03b8169582453c01cbf76d8a92a8194d3421b13. Revert this due to a final solution in amdgpu vcn: commit eef47ed5f703377781ce89eae4b9140325049873 Author: Sonny Jiang Date: Tue Jun 18 11:11:11 2024 -0400 drm/amdgpu/jpeg5: reprogram doorbell setting after power up for each pla

[PATCH] drm/amdgpu: track bo memory stats at runtime

2024-06-19 Thread Yunxiang Li
Before, every time fdinfo is queried we try to lock all the BOs in the VM and calculate memory usage from scratch. This works okay if the fdinfo is rarely read and the VMs don't have a ton of BOs. If either of these conditions is not true, we get a massive performance hit. In this new revision, we
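As a rough illustration of the trade-off described above, the sketch below keeps running per-domain byte counters that are updated whenever a BO is added, removed, or moved, so a query becomes a constant-time read instead of a walk over every BO. It assumes that "tracking at runtime" means this kind of incremental accounting; the snippet is truncated before it says so, and all names here (`struct vm_mem_stats`, `stats_bo_added`, and so on) are hypothetical rather than taken from the patch.

```c
#include <stdint.h>

/* Hypothetical placement domains, for illustration only. */
enum mem_domain { DOMAIN_VRAM, DOMAIN_GTT, DOMAIN_CPU, DOMAIN_COUNT };

/* Running totals kept per VM instead of being recomputed on every query. */
struct vm_mem_stats {
    uint64_t bytes[DOMAIN_COUNT];
};

/* Called when a BO becomes part of the VM. */
static void stats_bo_added(struct vm_mem_stats *s, enum mem_domain d,
                           uint64_t size)
{
    s->bytes[d] += size;
}

/* Called when a BO leaves the VM. */
static void stats_bo_removed(struct vm_mem_stats *s, enum mem_domain d,
                             uint64_t size)
{
    s->bytes[d] -= size;
}

/* Called when a BO migrates between domains. */
static void stats_bo_moved(struct vm_mem_stats *s, enum mem_domain from,
                           enum mem_domain to, uint64_t size)
{
    s->bytes[from] -= size;
    s->bytes[to] += size;
}

/* O(1) query: no per-BO locking or iteration is needed. */
static uint64_t stats_query(const struct vm_mem_stats *s, enum mem_domain d)
{
    return s->bytes[d];
}
```

The cost moves from the read path to the update path: each BO state change does a small amount of bookkeeping, which pays off when fdinfo is read often or the VM holds many BOs.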

[pull] amdgpu drm-fixes-6.10

2024-06-19 Thread Alex Deucher
Hi Dave, Sima, Fixes for 6.10. Two weeks worth. The following changes since commit 6ba59ff4227927d3a8530fc2973b80e94b54d58f: Linux 6.10-rc4 (2024-06-16 13:40:16 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.10-2024-06-1

Re: [PATCH 4/6] drm/amdgpu: add AMDGPU_INFO_GB_ADDR_CONFIG query

2024-06-19 Thread Marek Olšák
The INFO ioctl was designed to allow increasing the sizes of all info structures. GB_ADDR_CONFIG isn't that special to justify a separate query. Marek On Wed, Jun 19, 2024 at 5:31 AM Christian König wrote: > > I would try to avoid that. > > Putting everything into amdgpu_info_device was a mistak

[PATCH] drm/amdgpu: part I - normalize registers as local xcc to read/write under sriov in TLB

2024-06-19 Thread Jane Jian
[WHY] sriov has the higher bit violation when flushing tlb [HOW] normalize the registers to keep lower 16-bit (dword aligned) to avoid higher bit violation RLCG will mask xcd out and always assume it's accessing its own xcd [TODO] later will add the normalization in sriovw/rreg after fixing bugs
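A minimal sketch of the normalization described in [HOW], assuming the register offset is expressed in dwords and that "keep lower 16-bit" means masking with 0xFFFF. The macro and function names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed mask: keep only the low 16 bits of a dword-aligned offset. */
#define XCC_LOCAL_REG_MASK 0xFFFFu

/*
 * Strip the upper bits that select a particular XCC/XCD so the offset
 * refers to the same register on the local instance.
 */
static inline uint32_t normalize_xcc_reg(uint32_t dword_offset)
{
    return dword_offset & XCC_LOCAL_REG_MASK;
}
```

Under these assumptions, an offset whose upper bits are set maps to the same local register as the offset with those bits cleared, while an offset that already fits in 16 bits is returned unchanged.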

Re: [PATCH] drm/radeon: remove load callback

2024-06-19 Thread Thomas Zimmermann
Hi On 07.06.24 at 03:14, wu hoi pok wrote: this patch is to remove the load callback from the kms_driver, following closely to amdgpu, radeon_driver_load_kms and devm_drm_dev_alloc are used, most of the changes here are rdev->ddev to rdev_to_drm, which maps to adev_to_drm in amdgpu. however this

Re: [PATCH v2 8/8] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time

2024-06-19 Thread Alex Deucher
On Wed, Jun 19, 2024 at 9:50 AM Alex Deucher wrote: > > On Tue, Jun 18, 2024 at 7:53 PM Doug Anderson wrote: > > > > Hi, > > > > On Tue, Jun 18, 2024 at 3:00 PM Alex Deucher wrote: > > > > > > On Tue, Jun 18, 2024 at 5:40 PM Doug Anderson > > > wrote: > > > > > > > > Hi, > > > > > > > > > > >

Re: [PATCH v2 8/8] drm/amdgpu: Call drm_atomic_helper_shutdown() at shutdown time

2024-06-19 Thread Alex Deucher
On Tue, Jun 18, 2024 at 7:53 PM Doug Anderson wrote: > > Hi, > > On Tue, Jun 18, 2024 at 3:00 PM Alex Deucher wrote: > > > > On Tue, Jun 18, 2024 at 5:40 PM Doug Anderson wrote: > > > > > > Hi, > > > > > > > > > On Mon, Jun 17, 2024 at 8:01 AM Alex Deucher > > > wrote: > > > > > > > > On Wed,

Re: [PATCH v6 0/8] drm: Support per-plane async flip configuration

2024-06-19 Thread Ville Syrjälä
On Fri, Jun 14, 2024 at 04:37:41PM -0300, André Almeida wrote: > Hi Dmitry, > > On 14/06/2024 14:32, Dmitry Baryshkov wrote: > > On Fri, Jun 14, 2024 at 12:35:27PM GMT, André Almeida wrote: > >> AMD hardware can do async flips with overlay planes, but currently there's > >> no > >> easy way to

Re: [PATCH] drm/amdgpu: process RAS fatal error MB notification

2024-06-19 Thread Lazar, Lijo
On 6/19/2024 2:44 AM, Vignesh Chander wrote: > For RAS error scenario, VF guest driver will check mailbox > and set fed flag to avoid unnecessary HW accesses. > additionally, poll for reset completion message first > to avoid accidentally spamming multiple reset requests to host. > > v2: add an

Re: [PATCH] drm/amdgpu: part I - normalize registers as local xcc to read/write under sriov in TLB

2024-06-19 Thread Lazar, Lijo
On 6/19/2024 3:25 PM, Jane Jian wrote: > [WHY] > sriov has the higher bit violation when flushing tlb > > [HOW] > normalize the registers to keep lower 16-bit (dword aligned) to avoid higher > bit violation > RLCG will mask xcd out and always assume it's accessing its own xcd > > [TODO] > late

RE: [PATCH] drm/amdgpu: part I - normalize registers as local xcc to read/write under sriov in TLB

2024-06-19 Thread Jian, Jane
[AMD Official Use Only - AMD Internal Distribution Only] + mark Hi Lijo @Lazar, Lijo, Please help review this part I patch. For sriov read/write part, we find a bug while masking the offset, which needs time to debug and later I will submit patch II. Thanks, Jane -Original Message- From:

[PATCH] drm/amdgpu: part I - normalize registers as local xcc to read/write under sriov in TLB

2024-06-19 Thread Jane Jian
[WHY] sriov has the higher bit violation when flushing tlb [HOW] normalize the registers to keep lower 16-bit (dword aligned) to avoid higher bit violation RLCG will mask xcd out and always assume it's accessing its own xcd [TODO] later will add the normalization in sriovw/rreg after fixing bugs

Re: [PATCH 4/6] drm/amdgpu: add AMDGPU_INFO_GB_ADDR_CONFIG query

2024-06-19 Thread Christian König
I would try to avoid that. Putting everything into amdgpu_info_device was a mistake only done because people assumed that IOCTLs on Linux are too expensive to query all information separately. We should rather have distinct IOCTLs for each value because that is way more flexible and we won't

Re: [PATCH 1/6] drm/amdgpu: allow ioctls to opt-out of runtime pm

2024-06-19 Thread Christian König
On 18.06.24 at 17:23, Pierre-Eric Pelloux-Prayer wrote: Waking up a device can take multiple seconds, so if it's not going to be used we might as well not resume it. The safest default behavior for all ioctls is to resume the GPU, so this change allows specific ioctls to opt-out of generic runt

Re: [PATCH AUTOSEL 5.10 1/7] drm/amd/display: Exit idle optimizations before HDCP execution

2024-06-19 Thread Pavel Machek
Hi! > [WHY] > PSP can access DCN registers during command submission and we need > to ensure that DCN is not in PG before doing so. > > [HOW] > Add a callback to DM to lock and notify DC for idle optimization exit. > It can't be DC directly because of a potential race condition with the > link pr

Re: [PATCH AUTOSEL 6.1 13/14] drm/amdgpu: fix dereference null return value for the function amdgpu_vm_pt_parent

2024-06-19 Thread Pavel Machek
Hi! > [ Upstream commit a0cf36546cc24ae1c95d72253c7795d4d2fc77aa ] > > The pointer parent may be NULLed by the function amdgpu_vm_pt_parent. > To make the code more robust, check the pointer parent. If this can happen, it should not WARN(). If this can not happen, we don't need the patch in sta