[PATCH] drm/amdgpu: Remove gfxoff check in GFX v9.4.3

2023-08-14 Thread Lijo Lazar
GFXOFF feature is not there for GFX 9.4.3 ASICs. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c index 564770c3875e..4bbe9c5ed87f 100644

[PATCH] Documentation/gpu: Update amdgpu documentation

2023-08-15 Thread Lijo Lazar
7957ec80ef97 ("drm/amdgpu: Add FRU sysfs nodes only if needed") moved the documentation for some of the sysfs nodes to amdgpu_fru_eeprom.c. Update the documentation accordingly. Signed-off-by: Lijo Lazar --- Documentation/gpu/amdgpu/driver-misc.rst | 6 +++--- 1 file changed, 3 insert

[PATCH] drm/amdgpu: Remove gfxoff check in GFX v9.4.3

2023-08-15 Thread Lijo Lazar
GFXOFF feature is not there for GFX 9.4.3 ASICs. Signed-off-by: Lijo Lazar Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c index

[PATCH] drm/amdgpu: Add only valid firmware version nodes

2023-08-25 Thread Lijo Lazar
Show only firmware version attributes that have valid version. Hide others. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 33 --- 1 file changed, 29 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers

[PATCH] drm/amdgpu: Save VCN shared memory with init reset

2024-10-14 Thread Lijo Lazar
function. Signed-off-by: Lijo Lazar Reported-by: Hao Zhou Fixes: 1b665567fd6d ("drm/amdgpu: Add reset on init handler for XGMI") --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c | 26 ++- drivers/gpu/drm/amd/amdgpu/am

[PATCH] drm/amdgpu: Zero-initialize mqd backup memory

2024-10-14 Thread Lijo Lazar
Zero-initialize mqd backup memory, otherwise the check for 'already-backed-up' could go wrong. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/d
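
Why zero-initialization matters for an "already backed up" check can be sketched in plain C. This is only an illustration under assumptions: the struct, its header field, and the demo_* names are hypothetical stand-ins, not the amdgpu MQD layout or driver functions, and calloc stands in for a zeroing kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a queue descriptor; not the real MQD layout. */
struct demo_mqd {
    uint32_t header;      /* assumed non-zero once the descriptor is programmed */
    uint32_t payload[4];
};

/* Allocate the backup zeroed so that header == 0 reliably means "no backup yet". */
static struct demo_mqd *demo_alloc_backup(void)
{
    return calloc(1, sizeof(struct demo_mqd));
}

/* Copy live state into the backup only once.
 * Returns 1 if a backup was taken, 0 if one already existed. */
static int demo_backup_once(struct demo_mqd *backup, const struct demo_mqd *live)
{
    if (backup->header != 0)
        return 0;
    memcpy(backup, live, sizeof(*backup));
    return 1;
}
```

With an uninitialized allocation, the header test could see leftover non-zero bytes and skip the first backup, which is the failure mode the patch description hints at.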

[PATCH] drm/amdgpu: Use SPX as default in partition config

2024-10-14 Thread Lijo Lazar
In certain cases - ex: when a reset is required on initialization - XCP manager won't have a valid partition mode. In such cases, use SPX as the default selected mode for which partition configuration details are populated. Signed-off-by: Lijo Lazar Reported-by: Hao Zhou Fixes: c7de570

[PATCH v2] drm/amdgpu: Add NPS switch support for GC 9.4.3

2024-10-08 Thread Lijo Lazar
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- v2: Add NULL check for

[PATCH] drm/amdgpu: Wait for reset on init completion

2024-10-07 Thread Lijo Lazar
When reset on initialization is requested, wait for the reset to finish. In cases where module is loaded after boot, this makes sure all initialization work is done after a successful return of modprobe. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 9 - 1

[PATCH] drm/amdgpu: Fix the logic for NPS request failure

2024-10-17 Thread Lijo Lazar
On a hive, NPS request is placed by the first one for all devices in the hive. If the request fails, mark the mode as UNKNOWN so that subsequent devices on unload don't request it. Also, fix the mutex double lock issue in error condition, should have been mutex_unlock. Signed-off-by: Lijo

[PATCH v2] drm/amdgpu: Fix the logic for NPS request failure

2024-10-17 Thread Lijo Lazar
On a hive, NPS request is placed by the first one for all devices in the hive. If the request fails, mark the mode as UNKNOWN so that subsequent devices on unload don't request it. Also, fix the mutex double lock issue in error condition, should have been mutex_unlock. Signed-off-by: Lijo
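
The failure handling described above might look roughly like the following. It is a sketch under assumptions: the hive struct, the mode enum, the "placed" flag, and the stub senders are all hypothetical, and locking is left out entirely.

```c
#include <stdbool.h>

enum demo_nps_mode { DEMO_NPS_UNKNOWN = 0, DEMO_NPS1, DEMO_NPS2, DEMO_NPS4 };

/* Hypothetical hive-wide state shared by all devices in the hive. */
struct demo_hive {
    enum demo_nps_mode requested; /* UNKNOWN when nothing is pending */
    bool placed;                  /* request already sent for the whole hive */
    int attempts;                 /* how many times a request was sent */
};

typedef int (*demo_send_fn)(enum demo_nps_mode mode);

/* Stub senders, for illustration only. */
static int demo_send_ok(enum demo_nps_mode mode)   { (void)mode; return 0; }
static int demo_send_fail(enum demo_nps_mode mode) { (void)mode; return -1; }

/* Called by each device on unload. The first caller places the request for
 * the hive; on failure the mode is marked UNKNOWN so later callers skip it. */
static int demo_place_request(struct demo_hive *hive, demo_send_fn send)
{
    int ret;

    if (hive->placed || hive->requested == DEMO_NPS_UNKNOWN)
        return 0;

    hive->attempts++;
    ret = send(hive->requested);
    if (ret) {
        hive->requested = DEMO_NPS_UNKNOWN;
        return ret;
    }
    hive->placed = true;
    return 0;
}
```

After a failed first call, a second device calling demo_place_request returns immediately without sending anything, so attempts stays at 1.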

[PATCH v2] drm/amdgpu: Save VCN shared memory with init reset

2024-10-17 Thread Lijo Lazar
function. Signed-off-by: Lijo Lazar Reported-by: Hao Zhou Fixes: 1b665567fd6d ("drm/amdgpu: Add reset on init handler for XGMI") --- v2: Rename save function to a more appropriate amdgpu_vcn_save_vcpu_bo (Leo) drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 6 ++ drivers/gpu/drm/

[PATCH] drm/amdgpu: Group gfx sysfs functions

2024-10-28 Thread Lijo Lazar
Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all gfx related sysfs nodes. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 37 ++--- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.h | 2 -- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 5

[PATCH 1/2] drm/amdgpu: Fix unmap queue logic

2024-11-04 Thread Lijo Lazar
code. Signed-off-by: Lijo Lazar Fixes: 6c10b5cc4eaa ("drm/amdgpu: Remove duplicate code in gfx_v8_0.c") --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 47 ++ drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c |

[PATCH 2/2] drm/amdgpu: Avoid kcq disable during reset

2024-11-04 Thread Lijo Lazar
Reset sequence indicates that hardware already ran into a bad state. Avoid sending unmap queue request to reset KCQ. This will also cover RAS error scenarios which need a reset to recover, hence remove the check. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 10

[PATCH] drm/amdgpu: Fix DPX valid mode check on GC 9.4.3

2024-11-03 Thread Lijo Lazar
For DPX mode, the number of memory partitions supported should be less than or equal to 2. Signed-off-by: Lijo Lazar Fixes: 1589c82a1085 ("drm/amdgpu: Check memory ranges for valid xcp mode") --- drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 2 +- 1 file changed, 1 insertion(+),

[PATCH] drm/amdgpu: Skip IP coredump for RAS errors

2024-11-03 Thread Lijo Lazar
For RAS errors, source of error is known. Skip the core dump of IP states. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index

[PATCH v2 2/2] drm/amdgpu: Avoid kcq disable during reset

2024-11-04 Thread Lijo Lazar
Reset sequence indicates that hardware already ran into a bad state. Avoid sending unmap queue request to reset KCQ. This will also cover RAS error scenarios which need a reset to recover, hence remove the check. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 10

[PATCH v2 1/2] drm/amdgpu: Fix map/unmap queue logic

2024-11-04 Thread Lijo Lazar
newer code. Signed-off-by: Lijo Lazar Fixes: 6c10b5cc4eaa ("drm/amdgpu: Remove duplicate code in gfx_v8_0.c") --- v2: Add same changes to map queue also (Le Ma) drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 13 - drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 63 +++--

[PATCH v2] drm/amdgpu: Group gfx sysfs functions

2024-10-29 Thread Lijo Lazar
Make amdgpu_gfx_sysfs_init/fini functions as common entry points for all gfx related sysfs nodes. Signed-off-by: Lijo Lazar --- v2: Check cleaner shader capability only for creation of run_cleaner_shader attribute. drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 36 - drivers

[PATCH] drm/amdgpu: Add compatible NPS mode info

2024-10-30 Thread Lijo Lazar
Populate the compatible NPS modes also for providing partition configuration details through sysfs. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_xcp.h| 1 + drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 11 +++ 2 files changed, 12 insertions(+) diff --git a

[PATCH 0/7] Add support for dynamic NPS switch

2024-09-23 Thread Lijo Lazar
ch is pending and initiates a mode-1 reset. 7) During resume after a reset, NPS ranges are read again from discovery table. 8) Driver detects the new NPS mode and makes a compatible compute partition mode switch if required. Lijo Lazar (7): drm/amdgpu: Add option to refresh NPS data drm/amdgpu: Ad

[PATCH 1/7] drm/amdgpu: Add option to refresh NPS data

2024-09-23 Thread Lijo Lazar
In certain use cases, NPS data needs to be refreshed again from discovery table. Add API parameter to refresh NPS data from discovery table. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.h | 2

[PATCH 2/7] drm/amdgpu: Add PSP interface for NPS switch

2024-09-23 Thread Lijo Lazar
Implement PSP ring command interface for memory partitioning on the fly on the supported asics. Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 + drivers/gpu/drm/amd/amdgpu/psp_gfx_if.h | 1

[PATCH 3/7] drm/amdgpu: Add gmc interface to request NPS mode

2024-09-23 Thread Lijo Lazar
Add a common interface in GMC to request NPS mode through PSP. Also add a variable in hive and gmc control to track the last requested mode. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16 drivers/gpu/drm/amd/amdgpu

[PATCH 5/7] drm/amdgpu: Place NPS mode request on unload

2024-09-23 Thread Lijo Lazar
If a user has requested NPS mode switch, place the request through PSP during unload of the driver. For devices which are part of a hive, all requests are placed together. If one of them fails, revert back to the current NPS mode. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj

[PATCH 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-23 Thread Lijo Lazar
Add a callback to check if there is any condition detected by GMC block for reset on init. One case is if a pending NPS change request is detected. If reset is done because of NPS switch, refresh NPS info from discovery table. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu

[PATCH 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3

2024-09-23 Thread Lijo Lazar
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_nbio.h | 1 + drivers

[PATCH 4/7] drm/amdgpu: Add sysfs interfaces for NPS mode

2024-09-23 Thread Lijo Lazar
memory partition sysfs logic to be more generic. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++ 2 files changed, 104 insertions(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/

[PATCH 1/2] drm/amdgpu: Fetch NPS mode for GCv9.4.3 VFs

2024-09-23 Thread Lijo Lazar
Use the memory ranges published in discovery table to deduce NPS mode of GC v9.4.3 VFs. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 12 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 30

[PATCH 2/2] drm/amdgpu: Show current compute partition on VF

2024-09-23 Thread Lijo Lazar
Enable sysfs node for current compute partition mode on VFs also. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 29 +++-- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 12 -- 2 files changed, 31 insertions(+), 10 deletions(-) diff --git a

[PATCH v2 0/7] Add support for dynamic NPS switch

2024-09-26 Thread Lijo Lazar
eifei) Lijo Lazar (7): drm/amdgpu: Add option to refresh NPS data drm/amdgpu: Add PSP interface for NPS switch drm/amdgpu: Add gmc interface to request NPS mode drm/amdgpu: Add sysfs interfaces for NPS mode drm/amdgpu: Place NPS mode request on unload drm/amdgpu: Check gmc requiremen

[PATCH v2 2/7] drm/amdgpu: Add PSP interface for NPS switch

2024-09-26 Thread Lijo Lazar
Implement PSP ring command interface for memory partitioning on the fly on the supported asics. Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 25 + drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h | 1 + drivers/gpu/drm/amd

[PATCH v2 1/7] drm/amdgpu: Add option to refresh NPS data

2024-09-26 Thread Lijo Lazar
In certain use cases, NPS data needs to be refreshed again from discovery table. Add API parameter to refresh NPS data from discovery table. Signed-off-by: Lijo Lazar Reviewed-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 68 +++ drivers/gpu/drm/amd

[PATCH v2 5/7] drm/amdgpu: Place NPS mode request on unload

2024-09-26 Thread Lijo Lazar
If a user has requested NPS mode switch, place the request through PSP during unload of the driver. For devices which are part of a hive, all requests are placed together. If one of them fails, revert back to the current NPS mode. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj

[PATCH v2 6/7] drm/amdgpu: Check gmc requirement for reset on init

2024-09-26 Thread Lijo Lazar
Add a callback to check if there is any condition detected by GMC block for reset on init. One case is if a pending NPS change request is detected. If reset is done because of NPS switch, refresh NPS info from discovery table. Signed-off-by: Lijo Lazar --- v2: Move NPS request check ahead of TOS

[PATCH v2 4/7] drm/amdgpu: Add sysfs interfaces for NPS mode

2024-09-26 Thread Lijo Lazar
memory partition sysfs logic to be more generic. Signed-off-by: Lijo Lazar Reviewed-by: Rajneesh Bhardwaj --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 114 drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 6 ++ 2 files changed, 104 insertions(+), 16 deletions(-) diff -

[PATCH v2 7/7] drm/amdgpu: Add NPS switch support for GC 9.4.3

2024-09-26 Thread Lijo Lazar
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar Signed-off-by: Rajneesh Bhardwaj Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu

[PATCH v2 3/7] drm/amdgpu: Add gmc interface to request NPS mode

2024-09-26 Thread Lijo Lazar
Add a common interface in GMC to request NPS mode through PSP. Also add a variable in hive and gmc control to track the last requested mode. Signed-off-by: Rajneesh Bhardwaj Signed-off-by: Lijo Lazar Reviewed-by: Feifei Xu --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 16

[PATCH] drm/amdgpu: Fix logic to determine TOS reload

2024-09-30 Thread Lijo Lazar
Avoid comparing TOS version on APUs. On APUs driver doesn't take care of TOS load. Fixes: 2edc5ecbf1a9 ("drm/amdgpu: Add interface for TOS reload cases") Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-

[PATCH] drm/amdgpu: Simplify cleanup check for FRU sysfs

2024-11-28 Thread Lijo Lazar
FRU info is expected to be non-NULL if FRU sys files are created. Simplify the check. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.c b/drivers/gpu

[PATCH] drm/amdkfd: Use the correct wptr size

2024-11-18 Thread Lijo Lazar
Write pointer could be 32-bit or 64-bit. Use the correct size during initialization. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm
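
A minimal sketch of sizing the write-pointer initialization by its width, assuming a hypothetical helper that receives a raw pointer and a flag saying whether the queue uses a 64-bit write pointer. None of the names below come from kfd_kernel_queue.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size in bytes of the write pointer for the given queue flavor. */
static size_t demo_wptr_size(bool is_64bit)
{
    return is_64bit ? sizeof(uint64_t) : sizeof(uint32_t);
}

/* Clear exactly as many bytes as the write pointer occupies. */
static void demo_init_wptr(void *wptr_mem, bool is_64bit)
{
    memset(wptr_mem, 0, demo_wptr_size(is_64bit));
}
```

Clearing a fixed 4 bytes would leave the upper half of a 64-bit pointer stale, while clearing a fixed 8 bytes could write past a 32-bit slot.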

[PATCH] drm/amd/pm: Remove arcturus min power limit

2024-11-19 Thread Lijo Lazar
As per power team, there is no need to impose a lower bound on arcturus power limit. Any unreasonable limit set will result in frequent throttling. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu11/arcturus_ppt.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff

[PATCH 2/2] drm/amdgpu: Check whether in reset recovery state

2024-11-15 Thread Lijo Lazar
Some in_reset checks are in fact checking whether the state is reinitialization after reset. Replace with reset_in_recovery calls to identify that it's really checking for recovery stage after reset. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- driver

[PATCH 1/2] drm/amdgpu: Add init level for post reset reinit

2024-11-15 Thread Lijo Lazar
o identify post reset reinitialization phase. This only provides a device level identification, IP/features may choose to track their state independently also. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/aldebaran.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + d

[PATCH] drm/amdgpu: Prefer RAS recovery for scheduler hang

2024-11-17 Thread Lijo Lazar
ed to look for a fatal error. Skip fatal error checking in such cases. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/aldebaran.c| 2 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 15 - drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 55 ++- drivers/gp

[PATCH] drm/amdgpu: Remove gfxoff usage

2024-11-26 Thread Lijo Lazar
GFXOFF is not valid for these IP versions. Also, SDMA v4.4.2 is not in GFX domain. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 4 drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 -- 2 files changed, 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amd/pm: Revert state if force level fails

2024-12-06 Thread Lijo Lazar
-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 58 + 1 file changed, 35 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/amdgpu_dpm.c b/drivers/gpu/drm/amd/pm/amdgpu_dpm.c index 4d90e3f0bd17..6a9e26905edf 100644 --- a/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Avoid VF for RAS recovery source check

2024-12-09 Thread Lijo Lazar
VF device sets the RAS flag when mailbox data can't be read properly. There is no conclusive way to tell if the real source is RAS error. Therefore VF schedules a KFD based reset which doesn't set RAS source. Skip checking RAS source for any VF scheduled recovery. Signed-off-by:

[PATCH] drm/amdgpu: Add handler for SDMA context empty

2025-01-01 Thread Lijo Lazar
Context empty interrupt is enabled for SDMA 4.4.2. Add a handler for context empty interrupt so that it is disposed of fast, and not propagated to KFD layer. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 1 + drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 22

[PATCH] drm/amdgpu: Clean up atom header file inclusion

2025-02-04 Thread Lijo Lazar
atom bios header files are not required in these files. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v12_0.c | 1 - drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c| 1

[PATCH 1/4] drm/amdgpu: Add wrapper for freeing vbios memory

2025-02-05 Thread Lijo Lazar
Use bios_release wrapper to release memory allocated for vbios image and reset the variables. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c

[PATCH 4/4] drm/amdgpu: Make VBIOS image read optional

2025-02-05 Thread Lijo Lazar
Keep VBIOS image read optional for select SOCs in passthrough mode. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index

[PATCH 2/4] drm/amdgpu: Add VBIOS flags

2025-02-05 Thread Lijo Lazar
Instead of read_bios, use get_bios_flags to get various options around reading VBIOS. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b

[PATCH 1/3] drm/amd/pm: Add APIs for device access checks

2025-02-03 Thread Lijo Lazar
Wrap the checks before device access in helper functions and use them for device access. The generic order of APIs now is to do input argument validation first and check if device access is allowed. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 616

[PATCH 2/3] drm/amd/pm: Fix get_if_active usage

2025-02-03 Thread Lijo Lazar
ctive. Hence no need to get() to increment usage count. Remove < 0 return value check. Also, ignore runpm state to determine active status. If the device is already in suspend state, disallow access. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 28 ++

[PATCH 3/3] drm/amd/pm: Remove unnecessary device state checks

2025-02-03 Thread Lijo Lazar
For amdgpu_get_pp_force_state, amdgpu_get_pp_cur_state already takes care of device state check. In other cases, values are returned from driver cached variables and are not dependent on device state. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 14 -- 1 file

[PATCH 3/4] drm/amdgpu: Add flag to make VBIOS read optional

2025-02-05 Thread Lijo Lazar
Certain SOCs may not need much data from VBIOS. Some data like VBIOS version used will be missed but it doesn't affect functionality. Add a flag to make VBIOS image optional. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 70 +-- .../gpu/dr

[PATCH v2 1/4] drm/amdgpu: Move xgmi definitions to xgmi header

2025-02-09 Thread Lijo Lazar
Move definitions related to xgmi to amdgpu_xgmi header Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 23 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 35 +--- 3 files changed, 34

[PATCH v2 4/4] drm/amdgpu: Use xgmi APIs for init and bandwidth

2025-02-09 Thread Lijo Lazar
Initialize xgmi related static information during early_init. Use xgmi API to get max bandwidth details. Signed-off-by: Lijo Lazar --- v2: Move XGMI info init to early init phase (Jon) drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 -- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3

[PATCH v2 3/4] drm/amdgpu: Remove unsupported xgmi versions

2025-02-09 Thread Lijo Lazar
XGMI v4.8.0 is not used in any SOCs. Remove the associated functions. Also, ensure get_xgmi_info callback pointer is not NULL before calling the function. Signed-off-by: Lijo Lazar --- v2: Remove XGMI v4.8.0 as it is unused (Hawking) drivers/gpu/drm/amd/amdgpu/amdgpu_device.c| 2

[PATCH v2 2/4] drm/amdgpu: Add xgmi speed/width related info

2025-02-09 Thread Lijo Lazar
Add APIs to initialize XGMI speed, width details and get the max bandwidth supported. It is assumed that a device only supports same generation of XGMI links with uniform width. Signed-off-by: Lijo Lazar --- v2: Use GC versions as XGMI version is not populated for all SOCs (Hawking

[PATCH v2 3/4] drm/amdgpu: Add flag to make VBIOS read optional

2025-02-05 Thread Lijo Lazar
Certain SOCs may not need much data from VBIOS. Some data like VBIOS version used will be missed but it doesn't affect functionality. Add a flag to make VBIOS image optional. Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c

[PATCH v2 1/4] drm/amdgpu: Add wrapper for freeing vbios memory

2025-02-05 Thread Lijo Lazar
Use bios_release wrapper to release memory allocated for vbios image and reset the variables. v2: Use the same wrapper for clean up in sw_fini (Alex Deucher) Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd

[PATCH v2 2/4] drm/amdgpu: Add VBIOS flags

2025-02-05 Thread Lijo Lazar
Instead of read_bios, use get_bios_flags to get various options around reading VBIOS. Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 20 +--- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/drm/amd

[PATCH v2 4/4] drm/amdgpu: Make VBIOS image read optional

2025-02-05 Thread Lijo Lazar
Keep VBIOS image read optional for select SOCs in passthrough mode. Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu

[PATCH 2/4] drm/amdgpu: Add xgmi speed/width related info

2025-02-06 Thread Lijo Lazar
Add APIs to initialize XGMI speed, width details and get the max bandwidth supported. It is assumed that a device only supports same generation of XGMI links with uniform width. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 41 drivers/gpu/drm

[PATCH 1/4] drm/amdgpu: Move xgmi definitions to xgmi header

2025-02-06 Thread Lijo Lazar
Move definitions related to xgmi to amdgpu_xgmi header Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h | 23 +--- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 8 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.h | 35 +--- 3 files changed, 34

[PATCH 4/4] drm/amdgpu: Use xgmi APIs to get bandwidth

2025-02-06 Thread Lijo Lazar
Use xgmi API to get max bandwidth details. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c index

[PATCH 3/4] drm/amdgpu: Initialize xgmi info during discovery

2025-02-06 Thread Lijo Lazar
Initialize xgmi related static information during discovery. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 20 +-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers/gpu/drm/amd

[PATCH] drm/amd/pm: Fix get_if_active usage

2025-01-31 Thread Lijo Lazar
ctive. Hence no need to get() to increment usage count. Remove < 0 return value check. Signed-off-by: Lijo Lazar Fixes: 6e796cb4a972b ("drm/amd/pm: use pm_runtime_get_if_active for debugfs getters") --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 129 +++-- 1

[PATCH] drm/amd/pm: Limit to 8 jpeg rings per instance

2025-01-31 Thread Lijo Lazar
JPEG 5.0.1 supports up to 10 rings, however PMFW support for SMU v13.0.6 variants is now limited to 8 per instance. Limit to 8 temporarily to avoid out of bounds access. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 8 +--- 1 file changed, 5 insertions
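
The clamp described here can be illustrated as follows. This is a sketch under assumptions: the 8-slot limit is taken from the patch description, while the array shapes and the demo_* names are hypothetical and do not mirror the SMU metrics structures.

```c
#include <stdint.h>

/* Per-instance JPEG slots the firmware reports, per the patch description. */
#define DEMO_FW_JPEG_RINGS 8

/* Copy per-ring activity from a firmware-sized array into a driver array.
 * Only min(hw_rings, DEMO_FW_JPEG_RINGS) entries are read, so fw_busy is
 * never indexed past its end. Returns the number of entries copied. */
static unsigned int demo_copy_jpeg_busy(uint16_t *dst, const uint16_t *fw_busy,
                                        unsigned int hw_rings)
{
    unsigned int n = hw_rings < DEMO_FW_JPEG_RINGS ? hw_rings : DEMO_FW_JPEG_RINGS;
    unsigned int i;

    for (i = 0; i < n; i++)
        dst[i] = fw_busy[i];
    return n;
}
```

Entries in dst beyond the returned count are left untouched, so a caller can prefill them with an "unsupported" marker.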

[PATCH] drm/amdgpu: Use dbg level for VBIOS check messages

2024-12-11 Thread Lijo Lazar
Driver has different ways to fetch VBIOS. If one of the methods doesn't find an authentic one, it will show misleading info messages even though a subsequent method finds a valid VBIOS. Keep the message level at debug and add device context. Signed-off-by: Lijo Lazar --- drivers/gpu/dr

[PATCH] drm/amdgpu: Refine ip detection log message

2024-12-16 Thread Lijo Lazar
'add ip block' causes a confusion if the blocks are disabled later with ip_block_mask. Instead change to 'detected' and also add device context. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deleti

[PATCH] drm/amdgpu: Increase FRU File Id buffer size

2024-12-03 Thread Lijo Lazar
Some boards use longer File Ids. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_fru_eeprom.h index bc58dca18035

[PATCH v4] drm/amd/pm: Fix smu v13.0.6 caps initialization

2025-01-20 Thread Lijo Lazar
Fix the initialization and usage of SMU v13.0.6 capability values. Use caps_set/clear functions to set/clear capability. Also, fix SET_UCLK_MAX capability on APUs, it is supported on APUs. Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher Reviewed-by: Yang Wang Fixes: 9bb53d2ce109 ("

[PATCH v3] drm/amd/pm: Fix smu v13.0.6 caps initialization

2025-01-20 Thread Lijo Lazar
Fix the initialization and usage of SMU v13.0.6 capability values. Use caps_set/clear functions to set/clear capability. Also, fix SET_UCLK_MAX capability on APUs, it is supported on APUs. Signed-off-by: Lijo Lazar Reviewed-by: Alex Deucher Fixes: 9bb53d2ce109 ("drm/amd/pm: Add capab

[PATCH v2] drm/amd/pm: Fix smu v13.0.6 caps initialization

2025-01-20 Thread Lijo Lazar
Fix the initialization and usage of SMU v13.0.6 capability values. Use caps_set/clear functions to set/clear capability. Also, fix SET_UCLK_MAX capability on APUs, it is supported on APUs. Signed-off-by: Lijo Lazar Fixes: 9bb53d2ce109 ("drm/amd/pm: Add capability flags for SMU v13.0.6"

[PATCH 2/2] drm/amdgpu: Use version to figure out harvest info

2025-01-27 Thread Lijo Lazar
IP tables with version <=2 may use harvest bit. For version 3 and above, harvest bit is not applicable, instead uses harvest table. Fix the logic accordingly. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 27 +++ 1 file changed, 16 inserti
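
The version-dependent harvest check can be sketched like this. The structs and the demo_* names are hypothetical and do not reflect the discovery table binary layout. Only the rule itself (harvest bit for version 2 and below, harvest table for version 3 and above) comes from the patch description.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IP entry as read from an IP table. */
struct demo_ip {
    uint16_t hw_id;
    uint8_t  instance;
    bool     harvest_bit;   /* only meaningful for table version <= 2 */
};

/* Hypothetical harvest table entry. */
struct demo_harvest_entry {
    uint16_t hw_id;
    uint8_t  instance;
};

/* Decide whether an IP instance is harvested, based on the table version. */
static bool demo_ip_is_harvested(uint16_t table_version, const struct demo_ip *ip,
                                 const struct demo_harvest_entry *list, size_t n)
{
    size_t i;

    if (table_version <= 2)
        return ip->harvest_bit;

    /* Version 3 and above: ignore the bit, look up the harvest table. */
    for (i = 0; i < n; i++)
        if (list[i].hw_id == ip->hw_id && list[i].instance == ip->instance)
            return true;
    return false;
}
```

Passing hw_id and instance explicitly keeps the lookup independent of how the IP entry was parsed.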

[PATCH 1/2] drm/amdgpu: Pass IP instance/hwid as parameters

2025-01-27 Thread Lijo Lazar
Use IP instance number and hwid as function args for validation checks. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 42 --- 1 file changed, 28 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c b/drivers

[PATCH 2/2] drm/amdgpu: Clean up IP version checks in gmcv9.0

2025-01-27 Thread Lijo Lazar
Clean up some IP version checks in gmcv9.0 Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 49 ++- 1 file changed, 17 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index

[PATCH 1/2] drm/amdgpu: Clean up GFX v9.4.3 IP version checks

2025-01-27 Thread Lijo Lazar
Remove unnecessary IP version checks for GFX 9.4.3 and similar variants. Wrap checks inside meaningful function. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 68 ++-- drivers/gpu/drm/amd/amdgpu/gfxhub_v1_2.c | 22 2 files changed, 29

[PATCH] drm/amd/pm : Mark MM activity as unsupported

2025-01-21 Thread Lijo Lazar
Aldebaran doesn't support querying MM activity percentage. Keep the field as 0xFFs to mark it as unsupported. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/aldebaran_ppt

[PATCH] drm/amd/pm: Use one level table if dpm not enabled

2025-01-30 Thread Lijo Lazar
For SMU v13.0.6 variants, if dpm is disabled for a clock, fill current frequency as the only level in frequency table. Also, drop Lclk table as it is not used. Signed-off-by: Lijo Lazar --- .../drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 33 +++ 1 file changed, 19 insertions
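
Filling a single-level table when DPM is disabled might be sketched as below. The table struct, the level limit, and the demo_* names are hypothetical and are not taken from smu_v13_0_6_ppt.c.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_LEVELS 8

/* Hypothetical frequency table for one clock domain. */
struct demo_clk_table {
    uint32_t count;
    uint32_t freq_mhz[DEMO_MAX_LEVELS];
};

/* With DPM disabled the clock cannot change, so the table holds only the
 * current frequency. Otherwise copy the reported levels, clamped to capacity. */
static void demo_fill_clk_table(struct demo_clk_table *t, bool dpm_enabled,
                                uint32_t cur_mhz, const uint32_t *levels, uint32_t n)
{
    uint32_t i;

    if (!dpm_enabled) {
        t->count = 1;
        t->freq_mhz[0] = cur_mhz;
        return;
    }

    if (n > DEMO_MAX_LEVELS)
        n = DEMO_MAX_LEVELS;
    for (i = 0; i < n; i++)
        t->freq_mhz[i] = levels[i];
    t->count = n;
}
```

A consumer that prints the table then shows exactly one level for a clock whose DPM is off, instead of levels it can never reach.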

[PATCH] drm/amd/pm: Add capability flags for SMU v13.0.6

2025-01-16 Thread Lijo Lazar
Add capability flags for SMU v13.0.6 variants. Initialize the flags based on firmware support. As there are multiple IP versions maintained, it is more manageable with one-time initialization of caps flags based on IP version and firmware feature support. Signed-off-by: Lijo Lazar --- drivers/gpu

[PATCH 1/3] drm/amdgpu: Add VCN v4.0.3 RRMT register offset

2025-01-10 Thread Lijo Lazar
Add RRMT control register offset for VCN v4.0.3 Signed-off-by: Lijo Lazar Reviewed-by: Sathishkumar S --- drivers/gpu/drm/amd/include/asic_reg/vcn/vcn_4_0_3_offset.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/include/asic_reg/vcn

[PATCH 3/3] drm/amdgpu: Check RRMT status for JPEG v4.0.3

2025-01-10 Thread Lijo Lazar
RRMT could get dynamically enabled/disabled by PSP firmware. Read the RRMT status from the register. For VFs, this is not accessible, hence assume that it's always disabled for now. Signed-off-by: Lijo Lazar Reviewed-by: Sathishkumar S --- drivers/gpu/drm/amd/a

[PATCH 2/3] drm/amdgpu: Check RRMT status for VCN v4.0.3

2025-01-10 Thread Lijo Lazar
RRMT could get dynamically enabled/disabled by PSP firmware. Read the RRMT status from the register. For VFs, this is not accessible, hence assume that it's always disabled for now. Signed-off-by: Lijo Lazar Reviewed-by: Sathishkumar S --- drivers/gpu/drm/amd/amdgpu/amdgpu_
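
The status check described in the two RRMT patches can be sketched roughly as follows. Everything here is an assumption for illustration: the register is modelled as a plain struct field, and `RRMT_ENABLE_BIT` is a made-up bit position, not the real register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RRMT_ENABLE_BIT 0x1u  /* hypothetical bit, not the real field */

/* Hypothetical device view; the register read is modelled as a field. */
struct dev_view {
    bool is_vf;          /* running as an SR-IOV virtual function */
    uint32_t rrmt_cntl;  /* stand-in for the RRMT control register value */
};

/* Report whether RRMT is currently enabled for this device. */
static bool rrmt_enabled(const struct dev_view *d)
{
    assert(d != NULL);

    /* VFs cannot access the register, so assume disabled for now. */
    if (d->is_vf)
        return false;

    /* Firmware may toggle the bit at runtime, so read it on every query. */
    return (d->rrmt_cntl & RRMT_ENABLE_BIT) != 0;
}
```

Reading the status on each query, instead of caching it at init, is what lets the result follow firmware changing the setting later.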

[PATCH] drm/amd/pm: Use correct macros for smu caps

2025-01-17 Thread Lijo Lazar
Fix the initialization and usage of capability values and mask. SMU_CAPS_MASK indicates the mask value, and SMU_CAPS represents the capability value. Signed-off-by: Lijo Lazar Fixes: 9bb53d2ce109 ("drm/amd/pm: Add capability flags for SMU v13.0.6") --- .../drm/amd/pm/swsmu/smu13/smu_v13
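
The difference between a capability value and its mask is easy to mix up, so here is a hedged sketch of the general pattern. The names `CAP_FEATURE_*`, `CAP_MASK`, and the firmware version thresholds are hypothetical; they only mirror the idea of SMU_CAPS (a value) versus SMU_CAPS_MASK (a bit mask) and are not the driver's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Capability values: small indices, one per feature (hypothetical names). */
enum cap_id {
    CAP_FEATURE_A = 0,
    CAP_FEATURE_B = 1,
    CAP_FEATURE_C = 2,
    CAP_COUNT
};

/* Mask for a capability: a single bit at the position given by its value. */
#define CAP_MASK(id) (UINT64_C(1) << (id))

/* One-time initialization based on a (hypothetical) firmware version. */
static uint64_t caps_init(uint32_t fw_version)
{
    uint64_t caps = CAP_MASK(CAP_FEATURE_A);  /* always available here */

    if (fw_version >= 0x200u)
        caps |= CAP_MASK(CAP_FEATURE_B);
    if (fw_version >= 0x300u)
        caps |= CAP_MASK(CAP_FEATURE_C);

    return caps;
}

/* Query by value; the mask is derived inside, never passed by the caller. */
static bool cap_supported(uint64_t caps, enum cap_id id)
{
    assert(id < CAP_COUNT);
    return (caps & CAP_MASK(id)) != 0;
}
```

OR-ing the raw value into the flags word instead of its mask would set the wrong bits: `CAP_FEATURE_C` is 2, which as a mask would mean feature B, while its real mask is 4.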

[PATCH] drm/amdgpu: Use firmware supported NPS modes

2025-02-13 Thread Lijo Lazar
If firmware supported NPS modes are available through CAP register, use those values for supported NPS modes. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 36 +++ 1 file changed, 26 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/amd

[PATCH] drm/amdgpu: Remove redundant logic in GC v9.4.3

2025-02-16 Thread Lijo Lazar
GFXOFF check is not needed for GC v9.4.3. Also, save/restore list is available by default. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 17 + 1 file changed, 1 insertion(+), 16 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c b/drivers

[PATCH] drm/amdkfd: Use dev_* instead of pr_* for messages

2025-03-19 Thread Lijo Lazar
To get the device context, replace pr_ with dev_ functions. Signed-off-by: Lijo Lazar --- .../gpu/drm/amd/amdkfd/kfd_int_process_v10.c | 142 -- .../gpu/drm/amd/amdkfd/kfd_int_process_v11.c | 92 .../gpu/drm/amd/amdkfd/kfd_int_process_v9.c | 142

[PATCH] drm/amdgpu: Add basic validation for RAS header

2025-03-26 Thread Lijo Lazar
If RAS header read from EEPROM is corrupted, it could result in trying to allocate huge memory for reading the records. Add some validation to header fields. Signed-off-by: Lijo Lazar --- .../gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c| 22 --- 1 file changed, 19 insertions(+), 3
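
To illustrate why header validation matters before sizing an allocation, here is a minimal sketch. The header layout, the magic value, the sizes, and the upper bound are all hypothetical placeholders and do not reproduce the checks in amdgpu_ras_eeprom.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TBL_MAGIC       0x12345678u     /* hypothetical magic value */
#define TBL_HDR_SIZE    20u             /* hypothetical header size in bytes */
#define TBL_RECORD_SIZE 24u             /* hypothetical record size in bytes */
#define TBL_MAX_SIZE    (256u * 1024u)  /* hypothetical EEPROM capacity */

/* Hypothetical table header as read back from the EEPROM. */
struct tbl_header {
    uint32_t magic;
    uint32_t version;
    uint32_t first_rec_offset;
    uint32_t tbl_size;
    uint32_t checksum;
};

/* Reject headers whose fields cannot describe a real table. */
static bool header_is_sane(const struct tbl_header *h)
{
    assert(h != NULL);

    if (h->magic != TBL_MAGIC)
        return false;
    if (h->tbl_size < TBL_HDR_SIZE || h->tbl_size > TBL_MAX_SIZE)
        return false;
    if (h->first_rec_offset < TBL_HDR_SIZE ||
        h->first_rec_offset > h->tbl_size)
        return false;
    return true;
}

/* Number of records to allocate for; 0 when the header is not trusted. */
static uint32_t record_count(const struct tbl_header *h)
{
    if (!header_is_sane(h))
        return 0;
    return (h->tbl_size - TBL_HDR_SIZE) / TBL_RECORD_SIZE;
}
```

Without the bound on `tbl_size`, a corrupted value such as 0xFFFFFFFF would translate directly into a request for a very large record buffer.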

[PATCH] drm/amdgpu: Reset RAS table if header is invalid

2025-04-07 Thread Lijo Lazar
If a valid header is not found during RAS eeprom init, consider it as new and reset RAS table info. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras_eeprom.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu

[PATCH] drm/amdgpu: Fix xgmi v6.4.1 link status reporting

2025-04-01 Thread Lijo Lazar
Use the right register offsets for getting link status. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c b/drivers/gpu/drm/amd/amdgpu

[PATCH 1/2] drm/amdgpu: Use generic hdp flush function

2025-04-11 Thread Lijo Lazar
All HDP versions except v5.2 use common logic for HDP flush. Use a generic function. HDP v5.2 forces NO_KIQ logic, revisit it later. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.c | 21 + drivers/gpu/drm/amd/amdgpu/amdgpu_hdp.h | 2 ++ drivers/gpu/drm/amd
