[PATCH RESEND] drm/amd/display: adds kernel-doc comment for dc_stream_remove_writeback()

2025-04-30 Thread James Flowers
Adds a kernel-doc for externally linked dc_stream_remove_writeback() function. Signed-off-by: James Flowers --- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 8 1 file changed, 8 insertions(+) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/displ

[PATCH] drm/amdgpu/vcn4.0.5: Fix GFX10_ADDR_CONFIG programming for vcn1.

2025-04-30 Thread Ruijing Dong
The UVD_GFX10_ADDR_CONFIG's offset for vcn1 was programmed incorrectly, which causes the corrupted output from VCN1. This fixes the issue, copied from UVD_GFX10_ADDR_CONFIG programming from other VCN generations. Signed-off-by: Ruijing Dong --- drivers/gpu/drm/amd/amdgpu/vcn_v4_0_5.c | 2 +- 1

Re: [RFC PATCH 1/2] drm/amdgpu: amdgpu_vram_mgr_new(): Clamp lpfn to total vram

2025-04-30 Thread Paneer Selvam, Arunpravin
On 5/1/2025 2:50 AM, Alex Deucher wrote: + Christian On Tue, Apr 29, 2025 at 7:24 AM John Olender wrote: The drm_mm allocator tolerated being passed end > mm->size, but the drm_buddy allocator does not. Restore the pre-buddy-allocator behavior of allowing such placements. Closes: https://
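
For readers skimming the index, the clamping idea in this RFC can be sketched in a few lines of C. This is not the patch itself: the helper name clamp_lpfn() and its parameters are made up for illustration, and the treatment of lpfn == 0 as "no upper limit" is an assumption based on how TTM placements are commonly described.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: limit the last page frame
 * number of a placement to the size of the managed range.
 * Assumption: lpfn == 0 means "no upper limit".
 */
static uint64_t clamp_lpfn(uint64_t lpfn, uint64_t total_pages)
{
    if (lpfn == 0 || lpfn > total_pages)
        return total_pages;
    return lpfn;
}
```

With this shape, a placement whose end lies past the managed range is treated as if it ended at the range boundary, which appears to be the pre-buddy-allocator behaviour the commit message refers to.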

Re: [RFC PATCH 2/2] drm/amdgpu/uvd: Ensure vcpu bos are within the uvd segment

2025-04-30 Thread Alex Deucher
+ Christian On Tue, Apr 29, 2025 at 7:25 AM John Olender wrote: > > If the vcpu bos are allocated outside the uvd segment, > amdgpu_uvd_ring_test_ib() times out waiting on the ring's fence. > > See amdgpu_fence_driver_start_ring() for more context. > > Closes: https://gitlab.freedesktop.org/drm/a

Re: [RFC PATCH 1/2] drm/amdgpu: amdgpu_vram_mgr_new(): Clamp lpfn to total vram

2025-04-30 Thread Alex Deucher
+ Christian On Tue, Apr 29, 2025 at 7:24 AM John Olender wrote: > > The drm_mm allocator tolerated being passed end > mm->size, but the > drm_buddy allocator does not. > > Restore the pre-buddy-allocator behavior of allowing such placements. > > Closes: https://gitlab.freedesktop.org/drm/amd/-/is

Re: [PATCH] drm/amdgpu/psp: mark securedisplay TA as optional

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 4:04 AM Michel Dänzer wrote: > > On 2025-04-29 15:47, Alex Deucher wrote: > > This is an optional TA which is only available on > > certain embedded systems. Mark it as optional to avoid > > user confusion. This mirrors what we already do for > > other optional TAs. > > >

Re: Reply: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-30 Thread Alex Deucher
I think I have a better solution. Please try these patches instead. Thanks! For the RX6600, you only need patch 0003. The rest of the series fixes up other chips. Thanks, Alex On Sat, Apr 26, 2025 at 9:01 PM Alexey Klimov wrote: > > On Thu Apr 24, 2025 at 4:44 PM BST, Alex Deucher wrote: >

[PATCH v4 1/2] dma-fence: Add helper to sort and deduplicate dma_fence arrays

2025-04-30 Thread Arvind Yadav
Export a new helper function `dma_fence_dedup_array()` that sorts an array of dma_fence pointers by context, then deduplicates the array by retaining only the most recent fence per context. This utility is useful when merging or optimizing sets of fences where redundant entries from the same conte
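
The sort-then-deduplicate step described above can be illustrated with a small userspace sketch. Everything here is an assumption for illustration: struct toy_fence stands in for struct dma_fence, toy_fence_dedup() is not the kernel helper, plain seqno comparison replaces whatever "later" test the kernel uses, and reference counting is ignored entirely.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct dma_fence: only the fields the sketch needs. */
struct toy_fence {
    uint64_t context; /* timeline the fence belongs to */
    uint64_t seqno;   /* position on that timeline; larger is treated as later */
};

/* Order by context ascending, then by seqno descending (latest first). */
static int toy_fence_cmp(const void *a, const void *b)
{
    const struct toy_fence *fa = *(const struct toy_fence *const *)a;
    const struct toy_fence *fb = *(const struct toy_fence *const *)b;

    if (fa->context != fb->context)
        return fa->context < fb->context ? -1 : 1;
    if (fa->seqno != fb->seqno)
        return fa->seqno > fb->seqno ? -1 : 1;
    return 0;
}

/*
 * Sort the pointer array in place, keep only the first (latest) fence of
 * each context, and return the number of entries that remain.
 */
static size_t toy_fence_dedup(struct toy_fence **fences, size_t n)
{
    size_t i, out = 0;

    if (n == 0)
        return 0;

    qsort(fences, n, sizeof(*fences), toy_fence_cmp);

    for (i = 1; i < n; i++) {
        if (fences[i]->context != fences[out]->context)
            fences[++out] = fences[i];
    }
    return out + 1;
}
```

Sorting first means duplicates of a context sit next to each other, so a single linear pass is enough to drop them.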

[PATCH v4 2/2] drm/amdgpu: only keep most recent fence for each context

2025-04-30 Thread Arvind Yadav
Mesa passes shared bo, fence syncobj to userq_ioctl. There can be duplicates here or some fences that are old. This patch removes duplicate fences and keeps only the most recent fence for each context. v2: - Export this code from dma-fence-unwrap.c(by Christian). v3: - To split this in a dma_buf

[PATCH 2/7] drm/amdgpu/userq: add callback for reset

2025-04-30 Thread Alex Deucher
This is used to reset a queue. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h index 4d3eb651acf1a..24

[PATCH 3/7] drm/amdgpu: add mes userq reset callback

2025-04-30 Thread Alex Deucher
Used to reset a hung queue. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/mes_userqueue.c | 21 + 1 file changed, 21 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c b/drivers/gpu/drm/amd/amdgpu/mes_userqueue.c in

[PATCH 6/7] drm/amdgpu: add UAPI for user queue query status

2025-04-30 Thread Alex Deucher
Add an API to query queue status such as whether the queue is hung or whether vram is lost. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- include/uapi/drm/amdgpu_drm.h | 14 ++ 1 file changed, 14 insertions(+) diff --git a/include/uapi/drm/amdgpu_drm.h b/include/uapi/dr

[PATCH 1/7] drm/amdgpu: add user queue reset source

2025-04-30 Thread Alex Deucher
Track resets from user queues. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 1 + 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/dr

[PATCH 7/7] drm/amdgpu/userq: implement support for query status

2025-04-30 Thread Alex Deucher
Query the status of the user queue, currently whether the queue is hung and whether or not VRAM is lost. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 47 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_userq.h | 1 + 2 files ch

[PATCH 5/7] drm/amdgpu/userq: implement resets

2025-04-30 Thread Alex Deucher
If map or unmap fails, or a user fence times out, attempt to reset the queue. If that fails, schedule a GPU reset. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 8 ++ drivers/gpu/drm/

[PATCH 4/7] drm/amdgpu/userq: add force completion helpers

2025-04-30 Thread Alex Deucher
Add support for forcing completion of userq fences. This is needed for userq resets and asic resets so that we can set the error on the fence and force completion. Cc: Christian König Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_userq_fence.c | 42 +

Re: [PATCH] drm/amd/display: Rename program_timing function for better debugging

2025-04-30 Thread Alex Deucher
Applied. Thanks! On Wed, Apr 30, 2025 at 3:44 AM Antonio Fernando Silva e Cruz Filho wrote: > > [WHY] > Improve the output when using the ftrace debug feature, > making it easier to identify which function is currently being executed. > > [HOW] > Rename the program_timing function to a name that

Re: [PATCH] drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init()

2025-04-30 Thread Yadav, Arvind
I also thought for previous patch but else was doing that.  We can use something like this. just alternative solution. if (amdgpu_ip_version(adev, GC_HWIP, 0) < IP_VERSION(10, 0, 0) ||         (!AMDGPU_IS_MMHUB0(i) && !AMDGPU_IS_MMHUB1(i)))         id_mgr->num_ids = adev->vm_manager.first_kfd_vm

Re: [PATCH] Refine RAS bad page records counting and parsing in eeprom V3

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 12:13 AM ganglxie wrote: Please fix the patch title. Please add a drm/amdgpu prefix. E.g., drm/amdgpu: Refine RAS bad page records counting and parsing in eeprom V3 > > there is only MCA records in V3, no need to care about PA records. > recalculate the value of ras_n

[PATCH 2/5] drm/amd/display: hook up program_tg for dcn401

2025-04-30 Thread Melissa Wen
In this version, the global sync programming differs and needs a specific function call, slightly different from the commonly used dcn20_program_tg. Hook up dcn401_program_tg only for dcn401. Signed-off-by: Melissa Wen --- .../gpu/drm/amd/display/dc/hwss/dcn401/dcn401_hwseq.c| 9 - .

[PATCH 3/5] drm/amd/display: remove duplicated program_front_end_for_ctx code

2025-04-30 Thread Melissa Wen
Add detect_pipe_changes hook to dcn20_program_front_end_for_ctx and hook the latter to program_front_end_for_ctx in dcn401, then remove dcn401_program_front_end_for_ctx duplicated code. Signed-off-by: Melissa Wen --- .../amd/display/dc/hwss/dcn20/dcn20_hwseq.c | 13 +- .../amd/display/dc/hwss/

[PATCH 5/5] drm/amd/display: remove duplicated program_pipe code on dcn401

2025-04-30 Thread Melissa Wen
Reduce code duplication by reusing dcn20_program_pipe since both dcn401/dcn20_program_pipe now do the same thing and so its caller on dcn401. Signed-off-by: Melissa Wen --- .../amd/display/dc/hwss/dcn401/dcn401_hwseq.c | 126 -- .../amd/display/dc/hwss/dcn401/dcn401_hwseq.h |

[PATCH 1/5] drm/amd/display: add hook for program_tg

2025-04-30 Thread Melissa Wen
The only actual difference between dcn20_program_pipe and dcn401_program_pipe is the way they program global sync. Create a hook to enable hw-family function calls, so that we can reuse dcn20_program_pipe, avoid code duplication and prevent future partial fixes for the same portion of code. Signe

[PATCH 4/5] drm/amd/display: remove duplicated post_unlock_program_front_end code on dcn401

2025-04-30 Thread Melissa Wen
Enable hw_sequence_private funcs: enable plane for program_pipe and post_unlock_reset_opp for post_unlock_program_front_end (even if this is actually dcn20_post_unlock_reset_opp) to remove duplicated post_unlock_program_front_end code on dcn401. Signed-off-by: Melissa Wen --- .../amd/display/dc/

[PATCH 0/5] drm/amd/display: remove code duplication on dcn401

2025-04-30 Thread Melissa Wen
Hi, I've been examining dcn401 code to figure out what is causing a wrong cursor gamma on HDR issue reported in [1], and I found unnecessary code duplications during this inspection. I don't have the HW, so I'd appreciate if someone can validate this series (if it makes sense to you ofc). This se

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 4:13 AM Dan Carpenter wrote: > > The "ticket" pointer points to in the middle of the &exec struct so it > can't be NULL. Remove the check. > > Signed-off-by: Dan Carpenter Applied. Thanks! > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 2 +- > 1 file changed, 1 i

Re: [PATCH] drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init()

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 1:05 AM Yadav, Arvind wrote: > > Reviewed-by: Arvind Yadav > > On 4/29/2025 11:20 PM, Alex Deucher wrote: > > When kernel queues are disabled, all GC vmids are available > > for the scheduler. MM vmids are still managed by the driver > > so make all 16 available. > > > >

Re: [PATCH next] drm/amdgpu/userq: Call unreserve on error in amdgpu_userq_fence_read_wptr()

2025-04-30 Thread Alex Deucher
Applied. Thanks! On Wed, Apr 30, 2025 at 4:08 AM Dan Carpenter wrote: > > This error path should call amdgpu_bo_unreserve() before returning. > > Fixes: d8675102ba32 ("drm/amdgpu: add vm root BO lock before accessing the > vm") > Signed-off-by: Dan Carpenter > --- > drivers/gpu/drm/amd/amdgpu
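
The class of bug fixed here (returning from an error path while a buffer object is still reserved) is usually avoided with a single unwind label. The sketch below is a toy model of that pattern only: struct toy_bo, toy_reserve(), toy_unreserve() and toy_read_wptr() are hypothetical names, and none of this is the amdgpu code touched by the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical buffer object that only counts outstanding reservations. */
struct toy_bo {
    int reserve_count;
};

static int toy_reserve(struct toy_bo *bo)
{
    bo->reserve_count++;
    return 0;
}

static void toy_unreserve(struct toy_bo *bo)
{
    bo->reserve_count--;
}

/*
 * Every exit taken after a successful reserve goes through out_unreserve,
 * so the failure path cannot leave the object reserved.
 */
static int toy_read_wptr(struct toy_bo *bo, bool mapping_found,
                         unsigned long *wptr)
{
    int r = toy_reserve(bo);

    if (r)
        return r;

    if (!mapping_found) {
        r = -EINVAL;
        goto out_unreserve;
    }

    *wptr = 42; /* placeholder value for the sketch */

out_unreserve:
    toy_unreserve(bo);
    return r;
}
```

Routing both the success and failure paths through the same label keeps the reserve and unreserve calls visibly paired.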

[PATCH] drm/amd/display: Fix null check of pipe_ctx->plane_state for update_dchubp_dpp

2025-04-30 Thread Melissa Wen
Similar to commit 6a057072ddd1 ("drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipe") that addresses a null pointer dereference on dcn20_update_dchubp_dpp. This is the same function hooked for update_dchubp_dpp in dcn401, with the same issue. Fix possible null pointer d

Re: amdgpu: Reproducible soft lockups when playing games

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 3:55 AM Borislav Petkov wrote: > > + amdgpu folks > > On Tue, Apr 29, 2025 at 02:51:56PM +0200, Marcus Rückert wrote: > > Hardware: > > - ASUS ROG Swift OLED PG27AQDP @ 480 Hz > > - LG 27GL850-B @ 144Hz > > - XFX Mercury Radeon RX 9070 XT OC Gaming Edition with RGB, 16GB GD

Re: Display Port handling errors out when monitor is slow to wake up

2025-04-30 Thread Alex Deucher
On Wed, Apr 30, 2025 at 3:55 AM Borislav Petkov wrote: > > + amdgpu folks. > > On Tue, Apr 29, 2025 at 02:51:21PM +0200, Marcus Rückert wrote: > > Hardware: > > - ASUS ROG Swift OLED PG27AQDP > > - XFX Mercury Radeon RX 9070 XT OC Gaming Edition with RGB, 16GB GDDR6, > > HDMI, 3x DP RX-97TRGBBB9

[PATCH] drm/amdgpu: properly handle GC vs MM in amdgpu_vmid_mgr_init()

2025-04-30 Thread Alex Deucher
When kernel queues are disabled, all GC vmids are available for the scheduler. MM vmids are still managed by the driver so make all 16 available. Also fix gmc 10 vs 11 mix up in commit 1f61fc28b939 ("drm/amdgpu/mes: make more vmids available when disable_kq=1") v2: Properly handle pre-GC 10 har

Re: [PATCH v2 3/3] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-04-30 Thread Christian König
On 4/30/25 12:16, Samuel Zhang wrote: > When switching to new GPU index after hibernation and then resume, > VRAM offset of each VRAM BO will be changed, and the cached gpu > addresses need to be updated. > > This is to enable pdb0 and switch to use pdb0-based virtual gpu > address by default in am

Re: [PATCH AUTOSEL 6.14 33/39] drm/amdgpu: Allow P2P access through XGMI

2025-04-30 Thread Alex Deucher
On Tue, Apr 29, 2025 at 7:51 PM Sasha Levin wrote: > > From: Felix Kuehling > > [ Upstream commit a92741e72f91b904c1d8c3d409ed8dbe9c1f2b26 ] > > If peer memory is accessible through XGMI, allow leaving it in VRAM > rather than forcing its migration to GTT on DMABuf attachment. > > Signed-off-by:

Re: [PATCH AUTOSEL 6.12 32/37] drm/amdgpu: Allow P2P access through XGMI

2025-04-30 Thread Alex Deucher
On Tue, Apr 29, 2025 at 7:58 PM Sasha Levin wrote: > > From: Felix Kuehling > > [ Upstream commit a92741e72f91b904c1d8c3d409ed8dbe9c1f2b26 ] > > If peer memory is accessible through XGMI, allow leaving it in VRAM > rather than forcing its migration to GTT on DMABuf attachment. > > Signed-off-by:

Re: [PATCH AUTOSEL 6.6 18/21] drm/amdgpu: Allow P2P access through XGMI

2025-04-30 Thread Alex Deucher
On Tue, Apr 29, 2025 at 8:04 PM Sasha Levin wrote: > > From: Felix Kuehling > > [ Upstream commit a92741e72f91b904c1d8c3d409ed8dbe9c1f2b26 ] > > If peer memory is accessible through XGMI, allow leaving it in VRAM > rather than forcing its migration to GTT on DMABuf attachment. > > Signed-off-by:

Re: [PATCH AUTOSEL 6.1 07/10] drm/amdgpu: Allow P2P access through XGMI

2025-04-30 Thread Alex Deucher
On Tue, Apr 29, 2025 at 8:03 PM Sasha Levin wrote: > > From: Felix Kuehling > > [ Upstream commit a92741e72f91b904c1d8c3d409ed8dbe9c1f2b26 ] > > If peer memory is accessible through XGMI, allow leaving it in VRAM > rather than forcing its migration to GTT on DMABuf attachment. > > Signed-off-by:

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Sharma, Shashank
On 30/04/2025 14:35, Christian König wrote: On 4/30/25 11:28, Sharma, Shashank wrote: [AMD Official Use Only - AMD Internal Distribution Only] Hello Dan, *From:* Dan Carpenter *Sent:* Wednesday, April 30, 2025

Re: [PATCH v2 2/3] drm/amdgpu: update GPU addresses for SMU and PSP

2025-04-30 Thread Christian König
On 4/30/25 12:16, Samuel Zhang wrote: > add amdgpu_bo_fb_aper_addr() and update the cached GPU addresses to use > the FB aperture address for SMU and PSP. > > 2 reasons for this change: > 1. when pdb0 is enabled, gpu addr from amdgpu_bo_create_kernel() is GART > aperture address, it is not compati

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Sharma, Shashank
On 30/04/2025 11:49, Dan Carpenter wrote: On Wed, Apr 30, 2025 at 09:28:59AM +, Sharma, Shashank wrote: [AMD Official Use Only - AMD Internal Distribution Only] Hello Dan, From: Dan Carpenter Sent: Wednesday, April 30, 2025 10:05 AM To: Deucher, Alexander

Re: [PATCH v2 1/3] drm/amdgpu: update XGMI physical node id and GMC configs on resume

2025-04-30 Thread Christian König
On 4/30/25 12:16, Samuel Zhang wrote: > For virtual machine with vGPUs in SRIOV single device mode and XGMI > is enabled, XGMI physical node ids may change when waking up from > hibernation with different vGPU devices. So update XGMI physical node > ids on resume. > > Update GPU memory controller c

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Christian König
On 4/30/25 11:28, Sharma, Shashank wrote: > [AMD Official Use Only - AMD Internal Distribution Only] > > > Hello Dan, > > > *From:* Dan Carpenter > *Sent:* Wednesday, April 30, 2025 10:05 AM > *To:* Deucher, Alexand

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Christian König
On 4/30/25 10:05, Dan Carpenter wrote: > The "ticket" pointer points to in the middle of the &exec struct so it > can't be NULL. Remove the check. > > Signed-off-by: Dan Carpenter Reviewed-by: Christian König > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_userq.c | 2 +- > 1 file changed, 1 inse

Re: [PATCH v3 5/5] drm/amdgpu: lock the eviction fence before signaling it

2025-04-30 Thread Christian König
On 4/30/25 04:40, Prike Liang wrote: > Lock the eviction fence before trying to signal it. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c > b

Re: [PATCH v3 4/5] drm/amdgpu: validate the eviction fence before attaching/detaching

2025-04-30 Thread Christian König
On 4/30/25 04:40, Prike Liang wrote: > Before the user queue BOs resume workqueue is scheduled; > there's no valid eviction fence to attach the gem obj. > For this case, it doesn't need to attach/detach the eviction > fence. Also, it needs to unlock the bo first before returning > from the evict

Re: [PATCH v3 3/5] drm/amdgpu: fix the eviction fence dereference

2025-04-30 Thread Christian König
On 4/30/25 04:40, Prike Liang wrote: > The dma_resv_add_fence() already refers to the added fence. > So when attaching the eviction fence to the gem bo, it needn't > refer to it anymore. > > Signed-off-by: Prike Liang > Reviewed-by: Christian König This is a bug fix and as such should always com

Re: [PATCH v3 2/5] drm/amdgpu: don't sync the user queue eviction fence

2025-04-30 Thread Christian König
On 4/30/25 04:40, Prike Liang wrote: > Don't return and sync the user queue eviction fence; > otherwise, the eviction fence will be returned as a > dependent fence during VM update and refer to the fence > result in leakage. Please drop that patch, it shouldn't be needed any more after the changes

Re: [PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-04-30 Thread Zhang, GuoQing (Sam)
[AMD Official Use Only - AMD Internal Distribution Only] Hi @Koenig, Christian, Thank you for the feedback. I have revised the patch according to your suggestions and sent out the v2 patch list. Please help review. Thank you! mail titles of v2 patchlist: [PATCH

[PATCH v2 3/3] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-04-30 Thread Samuel Zhang
When switching to new GPU index after hibernation and then resume, VRAM offset of each VRAM BO will be changed, and the cached gpu addresses need to be updated. This is to enable pdb0 and switch to use pdb0-based virtual gpu address by default in amdgpu_bo_create_reserved(). since the virtual addre

[PATCH v2 2/3] drm/amdgpu: update GPU addresses for SMU and PSP

2025-04-30 Thread Samuel Zhang
add amdgpu_bo_fb_aper_addr() and update the cached GPU addresses to use the FB aperture address for SMU and PSP. 2 reasons for this change: 1. when pdb0 is enabled, gpu addr from amdgpu_bo_create_kernel() is GART aperture address, it is not compatible with SMU and PSP, it needs to be updated to use FB

[PATCH v2 1/3] drm/amdgpu: update XGMI physical node id and GMC configs on resume

2025-04-30 Thread Samuel Zhang
For virtual machine with vGPUs in SRIOV single device mode and XGMI is enabled, XGMI physical node ids may change when waking up from hibernation with different vGPU devices. So update XGMI physical node ids on resume. Update GPU memory controller configuration on resume if XGMI physical node ids a

[PATCH v2 0/3] enable switching to new gpu index for hibernate on SRIOV.

2025-04-30 Thread Samuel Zhang
On SRIOV and VM environment, customer may need to switch to new vGPU indexes after hibernate and then resume the VM. For GPUs with XGMI, `vram_start` will change in this case, the FB aperture gpu address of VRAM BOs will also change. These gpu addresses need to be updated when resume. But these ad

Re: [PATCH] drm/amdgpu/userq: remove unnecessary NULL check

2025-04-30 Thread Sharma, Shashank
[AMD Official Use Only - AMD Internal Distribution Only] Hello Dan, From: Dan Carpenter Sent: Wednesday, April 30, 2025 10:05 AM To: Deucher, Alexander Cc: Koenig, Christian; David Airlie; Simona Vetter; Sharma, Shashank; Khatri, Sunil; Yadav, Arvind; Paneer Selv

RE: [PATCH 7/7] drm/amdgpu: set vram type for GC 9.5.0

2025-04-30 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Hit send too quickly. please drop the vendor since it is not needed. With that fixed. The patch is Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Zhang, Hawking Sent: Wednesday, April

RE: [PATCH 7/7] drm/amdgpu: set vram type for GC 9.5.0

2025-04-30 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] "vendor" is not needed at this stage. Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: Wednesday, April 30, 2025 16:26 To: amd-gfx@lists.freedesktop.org Cc: Zhou1, Tao Subject: [PATCH 7/7] drm/amdgpu:

RE: [PATCH] drm/amdgpu: handle old RAS eeprom data in non-nps1 mode

2025-04-30 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: amd-gfx On Behalf Of Tao Zhou Sent: Wednesday, April 30, 2025 16:39 To: amd-gfx@lists.freedesktop.org Cc: Zhou1, Tao Subject: [PATCH] drm/amdgpu: handle old RAS

[PATCH] drm/amdgpu: handle old RAS eeprom data in non-nps1 mode

2025-04-30 Thread Tao Zhou
Get MCA address from PA in nps1, then convert MCA address to PA in specific nps mode. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 16 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_umc.c | 22 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 2 ++ 3

[PATCH 7/7] drm/amdgpu: set vram type for GC 9.5.0

2025-04-30 Thread Tao Zhou
Set vram type and vendor so we can take different actions per the type. Signed-off-by: Tao Zhou --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 464015fc2012..a

[PATCH 6/7] drm/amdgpu: set flip bits for RAS bad pages

2025-04-30 Thread Tao Zhou
Make the code more general, user doesn't need to pay attention to the detail of flip bits setting. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/

[PATCH 4/7] drm/amdgpu: implement get_retire_flip_bits for UMC v12

2025-04-30 Thread Tao Zhou
The RAS bad page retire flip bits can be set per vram type, vram vendor and NPS mode. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 1 - drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 82 +- drivers/gpu/drm/amd/amdgpu/umc_v12_

[PATCH 3/7] drm/amdgpu: add get_retire_flip_bits for UMC

2025-04-30 Thread Tao Zhou
Add the general interface to get flip bits for RAS bad page retirement. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 4 drivers/gpu/drm/amd/amdgpu/amdgpu_umc.h | 15 +++ 2 files changed, 19 insertions(+) diff --git a/drivers

[PATCH 5/7] drm/amdgpu: get RAS retire flip bits for new type of HBM

2025-04-30 Thread Tao Zhou
Get RAS retire flip bits for HBM with different types and vendors in various NPS modes. Also set flip row bit and MCA R13 bit in PA in different NPS modes. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 47 -- drivers/gpu/

[PATCH 1/7] drm/amd: add definition for new memory type

2025-04-30 Thread Tao Zhou
Support new version of HBM. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c | 3 +++ drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 3 ++- drivers/gpu/drm/amd/include/atomfirmware.h | 1 + include/uapi/drm/amdgpu_drm.h

[PATCH 2/7] drm/amdgpu: adjust high bits for RAS retired page

2025-04-30 Thread Tao Zhou
Per UMC address conversion algorithm, the high row bits of UMC MCA address are changed when they're converted into normalized address on specific ASICs. Signed-off-by: Tao Zhou Reviewed-by: Hawking Zhang --- drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 13 +++-- 1 file changed, 11 insertion

Re: [PATCH] drm/amdgpu/psp: mark securedisplay TA as optional

2025-04-30 Thread Michel Dänzer
On 2025-04-29 15:47, Alex Deucher wrote: > This is an optional TA which is only available on > certain embedded systems. Mark it as optional to avoid > user confusion. This mirrors what we already do for > other optional TAs. > > Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4181 > Sig

[PATCH 5.10 229/286] drm/amdgpu/dma_buf: fix page_link check

2025-04-30 Thread Greg Kroah-Hartman
5.10-stable review patch. If anyone has any objections, please let me know. -- From: Matthew Auld [ Upstream commit c0dd8a9253fadfb8e5357217d085f1989da4ef0a ] The page_link lower bits of the first sg could contain something like SG_END, if we are mapping a single VRAM page or

[PATCH v2] drm/amdgpu: Remove redundant return value checks for amdgpu_ras_error_data_init

2025-04-30 Thread Wentao Liang
The function amdgpu_ras_error_data_init() always returns 0, making its return value checks redundant. This patch changes its return type to void and removes all unnecessary checks in the callers. This simplifies the code and avoids confusion about the function's behavior. Additionally, this change

[PATCH 5.15 183/373] drm/amdgpu/dma_buf: fix page_link check

2025-04-30 Thread Greg Kroah-Hartman
5.15-stable review patch. If anyone has any objections, please let me know. -- From: Matthew Auld commit c0dd8a9253fadfb8e5357217d085f1989da4ef0a upstream. The page_link lower bits of the first sg could contain something like SG_END, if we are mapping a single VRAM page or con

[PATCH v2] drm/amdkfd: enable kfd on RISCV systems

2025-04-30 Thread liu.song13
From: Xuemei Liu KFD has been confirmed to run on RISCV systems. It's necessary to support CONFIG_HSA_AMD on RISCV. Signed-off-by: Xuemei Liu --- drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/d

Re: Display Port handling errors out when monitor is slow to wake up

2025-04-30 Thread Borislav Petkov
+ amdgpu folks. On Tue, Apr 29, 2025 at 02:51:21PM +0200, Marcus Rückert wrote: > Hardware: > - ASUS ROG Swift OLED PG27AQDP > - XFX Mercury Radeon RX 9070 XT OC Gaming Edition with RGB, 16GB GDDR6, HDMI, > 3x DP RX-97TRGBBB9 > > Kernel: > - kernel-default-6.15~rc4-1.1.g62ec7c7.x86_64 from >

[PATCH 5.10 227/286] drm/amd/amdgpu/amdgpu_vram_mgr: Add missing descriptions for dev and dir

2025-04-30 Thread Greg Kroah-Hartman
5.10-stable review patch. If anyone has any objections, please let me know. -- From: Lee Jones [ Upstream commit 2c8645b7a6974b33744b677e9ddc89650776af46 ] Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c:648: warning: Function pa

amdgpu: Reproducible soft lockups when playing games

2025-04-30 Thread Borislav Petkov
+ amdgpu folks On Tue, Apr 29, 2025 at 02:51:56PM +0200, Marcus Rückert wrote: > Hardware: > - ASUS ROG Swift OLED PG27AQDP @ 480 Hz > - LG 27GL850-B @ 144Hz > - XFX Mercury Radeon RX 9070 XT OC Gaming Edition with RGB, 16GB GDDR6, HDMI, > 3x DP RX-97TRGBBB9 > - Ryzen 9 9950X3D on ASUS ProArt X8

[PATCH] drm/amd/display: Rename program_timing function for better debugging

2025-04-30 Thread Antonio Fernando Silva e Cruz Filho
[WHY] Improve the output when using the ftrace debug feature, making it easier to identify which function is currently being executed. [HOW] Rename the program_timing function to a name that includes the path to the function's file. Signed-off-by: Antonio Fernando Silva e Cruz Filho Co-developed

Re: [PATCH] drm/amdkfd: enable kfd on RISCV systems

2025-04-30 Thread liu.song13
>> From: Xuemei Liu >> >> KFD has been confirmed to run on RISCV systems. It's necessary to >> support CONFIG_HSA_AMD on RISCV. > > Is there a public user mode branch with any changes needed to make ROCm > user mode work with RISCV? > > One more question inline. > > >> Signed-off-by: Xu

RE: [PATCH] Refine RAS bad page records counting and parsing in eeprom V3

2025-04-30 Thread Zhou1, Tao
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Tao Zhou > -Original Message- > From: Xie, Patrick > Sent: Wednesday, April 30, 2025 12:05 PM > To: amd-gfx@lists.freedesktop.org > Cc: Zhou1, Tao ; Xie, Patrick > Subject: [PATCH] Refine RAS bad page records countin