Re: [PATCH] drm/amd/pm/powerplay/smumgr/vegam_smumgr: Fix error handling in vegam_populate_smc_boot_level()

2025-04-16 Thread Alex Deucher
On Tue, Apr 15, 2025 at 8:33 AM Wentao Liang wrote: > > In vegam_populate_smc_boot_level(), the return value of > phm_find_boot_level() is 0 or negative error code and the > "if (result)" branch statement will never run into the true > branch. Besides, this will skip setting the voltages later > b

[pull] amdgpu drm-fixes-6.15

2025-04-16 Thread Alex Deucher
Hi Dave, Simona, Fixes for 6.15. The following changes since commit 8ffd015db85fea3e15a77027fda6c02ced4d2444: Linux 6.15-rc2 (2025-04-13 11:54:49 -0700) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-fixes-6.15-2025-04-16 for you to fe

Re: [PATCH v3] drm/amd/display: Add error check for avi and vendor infoframe setup function

2025-04-16 Thread Alex Hung
Reviewed-by: Alex Hung On 4/13/25 21:14, Wentao Liang wrote: The function fill_stream_properties_from_drm_display_mode() calls the function drm_hdmi_avi_infoframe_from_display_mode() and the function drm_hdmi_vendor_infoframe_from_display_mode(), but does not check its return value. Log the err

[PATCH 6/7] drm/amdgpu/userq: rename eviction helpers

2025-04-16 Thread Alex Deucher
suspend/resume -> evict/restore Rename to avoid confusion with the system suspend and resume helpers. Signed-off-by: Alex Deucher --- .../gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c| 16 drivers/gpu/drm/amd/amdgpu/amdg

[PATCH 1/7] drm/amdgpu/userq: add a helper to check which IPs are enabled

2025-04-16 Thread Alex Deucher
Add a helper to get a mask of IPs which support user queues. Use this in the INFO IOCTL to get the IP mask to replace the current code. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 7 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 13 + dr

[PATCH 4/7] drm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini()

2025-04-16 Thread Alex Deucher
This was missed when the map and unmap were split out of the mqd create and destroy functions. Fixes: 5b1163621548 ("drm/amdgpu/userq: rework front end call sequence") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 4 +++- 1 file changed, 3 insertions(+), 1 delet

[PATCH 3/7] drm/amdgpu: switch from queue_active to queue state

2025-04-16 Thread Alex Deucher
Track the state of the queue rather than simple active vs not. This is needed for other states (hung, preempted, etc.). While we are at it, move the state tracking into the user queue front end code. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 101 +++

[PATCH 5/7] drm/amdgpu/userq: move waiting for last fence before umap

2025-04-16 Thread Alex Deucher
Need to wait for the last fence before unmapping. This also fixes a memory leak in amdgpu_userqueue_cleanup() when the fence isn't signalled. Fixes: 5b1163621548 ("drm/amdgpu/userq: rework front end call sequence") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |

[PATCH 7/7] drm/amdgpu/userq: use consistent function naming

2025-04-16 Thread Alex Deucher
s/userqueue/userq/ 1. remove the mix of amdgpu_userqueue and amdgpu_userq 2. to be consistent with other amdgpu_userq_fence.c 3. it's shorter Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/Makefile | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +- drivers/g

[PATCH 2/7] drm/amdgpu/userq: optimize enforce isolation and s/r

2025-04-16 Thread Alex Deucher
If user queues are disabled for all IPs in the case of suspend and resume and for gfx/compute in the case of enforce isolation, we can return early. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 18 ++ 1 file changed, 18 insertions(+) diff --git

[PATCH 1/2] drm/amdgpu/userq: unmap queues amdgpu_userq_mgr_fini()

2025-04-16 Thread Alex Deucher
This was missed when the map and unmap were split out of the mqd create and destroy functions. Fixes: 5b1163621548 ("drm/amdgpu/userq: rework front end call sequence") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 4 +++- 1 file changed, 3 insertions(+), 1 delet

[PATCH 2/2] drm/amdgpu/userq: move waiting for last fence before umap

2025-04-16 Thread Alex Deucher
Need to wait for the last fence before unmapping. This also fixes a memory leak in amdgpu_userqueue_cleanup() when the fence isn't signalled. Fixes: 5b1163621548 ("drm/amdgpu/userq: rework front end call sequence") Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c |

Re: [PATCH][next] drm/amd/pm: Avoid multiple -Wflex-array-member-not-at-end warnings

2025-04-16 Thread Alex Deucher
Can you resend, I can't seem to find the original emails. Additionally, all of the NISLANDS structures are unused in amdgpu, so those could be removed. Alex On Wed, Apr 16, 2025 at 12:48 AM Gustavo A. R. Silva wrote: > > Hi all, > > Friendly ping (second one): who can take this patch, please? 🙂 >

Re: [PATCH v4 1/5] drm: add macro drm_file_err to print process info

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 7:55 PM, Jani Nikula wrote: On Wed, 16 Apr 2025, Sunil Khatri wrote: Add a drm helper macro which appends the process information for the drm_file over drm_err. Signed-off-by: Sunil Khatri --- include/drm/drm_file.h | 41 + 1 file changed,

Re: Reply: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-16 Thread Alex Deucher
On Wed, Apr 16, 2025 at 9:48 AM Alexey Klimov wrote: > > On Wed Apr 16, 2025 at 4:12 AM BST, Fugang Duan wrote: > > From: Alexey Klimov Sent: April 16, 2025 2:28 > >>#regzbot introduced: v6.12..v6.13 > > [..] > > >>The only change related to hdp_v5_0_flush_hdp() was > >>cf424020e040 drm/amdgpu/hdp5.0:

RE: [PATCH 4/4] drm/amdgpu: free the evf when the attached bo release

2025-04-16 Thread Liang, Prike
[Public] > From: Koenig, Christian > Sent: Wednesday, April 16, 2025 7:07 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH 4/4] drm/amdgpu: free the evf when the attached bo release > > On 16.04.25 at 10:50, Prike Liang wrote: > > Free the evf

Re: [PATCH v4 1/5] drm: add macro drm_file_err to print process info

2025-04-16 Thread Jani Nikula
On Wed, 16 Apr 2025, Sunil Khatri wrote: > Add a drm helper macro which appends the process information for > the drm_file over drm_err. > > Signed-off-by: Sunil Khatri > --- > include/drm/drm_file.h | 41 + > 1 file changed, 41 insertions(+) > > diff --git

RE: [PATCH 3/4] drm/amdgpu: trace the scheduler dependent job fence name

2025-04-16 Thread Liang, Prike
[Public] > From: Koenig, Christian > Sent: Wednesday, April 16, 2025 7:04 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH 3/4] drm/amdgpu: trace the scheduler dependent job fence > name > > On 16.04.25 at 10:50, Prike Liang wrote: > > This tr

[PATCH v5 5/5] drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver

2025-04-16 Thread Sunil Khatri
update the functions in amdgpu_userqueues.c from DRM_DBG_DRIVER to drm_dbg_driver so multi gpu instance can be logged in. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/

[PATCH v5 3/5] drm/amdgpu: use drm_file_err to add process info

2025-04-16 Thread Sunil Khatri
add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. drm_file_err logs pid and process name by default. Sample log: [ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed out waiting for fence f=1c74d97

[PATCH v5 2/5] drm/amdgpu: add drm_file reference in userq_mgr

2025-04-16 Thread Sunil Khatri
drm_file will be used in usermode queues code to enable better process information in logging and hence add drm_file part of the userq_mgr struct. update the drm_file pointer in userq_mgr for each amdgpu_driver_open_kms. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c

[PATCH v5 1/5] drm: add macro drm_file_err to print process info

2025-04-16 Thread Sunil Khatri
Add a drm helper macro which appends the process information for the drm_file over drm_err. Signed-off-by: Sunil Khatri --- include/drm/drm_file.h | 38 ++ 1 file changed, 38 insertions(+) diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h index 94d3

[PATCH v5 4/5] drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userqueue.c

2025-04-16 Thread Sunil Khatri
change the DRM_ERROR to drm_file_err to add process name and pid to the logging. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 52 +++ 1 file changed, 29 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b

Re: [PATCH 6/6] drm/amdgpu: fix fence fallback timer expired error

2025-04-16 Thread Christian König
On 14.04.25 at 12:46, Samuel Zhang wrote: > IH is not working after switching to a new gpu index for the first time. > IH handler function needs to be re-registered with kernel after switching > to new gpu index. Why? Christian. > > Signed-off-by: Samuel Zhang > Change-Id: Idece1c8fce24032fd08f5a8

Re: [PATCH 5/6] drm/amdgpu: fix sdma ring test fail when resume from hibernation

2025-04-16 Thread Christian König
On 14.04.25 at 12:46, Samuel Zhang wrote: > gart tlb may be stale when switching to a new gpu index. This causes the gpu > to fetch wrong data from gtt memory. Flush gart tlb at the end of gmc > resume to fix it. Well that's complete nonsense. When the TLB contains entries after a resume then that is a m

Re: [PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-04-16 Thread Christian König
On 14.04.25 at 12:46, Samuel Zhang wrote: > When switching to new GPU index after hibernation and then resume, > VRAM offset of each VRAM BO will be changed, and the cached gpu > addresses need to be updated. > > This is to enable pdb0 and switch to use pdb0-based virtual gpu > address by default i

[PATCH RFC] drm/amdgpu: Block userspace mapping of IO

2025-04-16 Thread Ujwal Kundur
This is a RFC patch for blocking userspace mapping of IO register(s) before ioremap() calls are made. Out of the available IRQ sources, CRTC seemed the most appropriate for this task, however I'm not quite sure about that as well as the type, which I've set to 0. If I understand correctly, we actu

[PATCH] drm/amd/include: fix kernel-doc formatting in amd_shared.h

2025-04-16 Thread Luke Hofstetter
when doing make htmldocs, Sphinx complained about in-line documentation in enum DC_DEBUG_MASK, so reformatted documentation to define each member in kernel-doc comment above the enum instead. Signed-off-by: Luke Hofstetter --- drivers/gpu/drm/amd/include/amd_shared.h | 124 ++

Re: [PATCH RFC] drm/amdgpu: Block userspace mapping of IO

2025-04-16 Thread Ujwal Kundur
Thanks for your response. > Hui what? Why do you think that grabbing a reference to an interrupt would > block userspace mapping of IO registers? It looks like I am missing a lot of pieces to do this, I'll try again once I have a better understanding. Sorry about that and thanks again for your

Re: Reply: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-16 Thread Alexey Klimov
On Wed Apr 16, 2025 at 4:12 AM BST, Fugang Duan wrote: > From: Alexey Klimov Sent: April 16, 2025 2:28 >>#regzbot introduced: v6.12..v6.13 [..] >>The only change related to hdp_v5_0_flush_hdp() was >>cf424020e040 drm/amdgpu/hdp5.0: do a posting read when flushing HDP >> >>Reverting that commit ^^ did

Re: [PATCH 1/6] drm/amdgpu: update XGMI physical node id and GMC configs on resume

2025-04-16 Thread Christian König
On 14.04.25 at 12:46, Samuel Zhang wrote: > For virtual machine with vGPUs in SRIOV single device mode and XGMI > is enabled, XGMI physical node ids may change when waking up from > hibernation with different vGPU devices. So update XGMI physical node > ids on resume. > > Update GPU memory controll

[PATCH v4 4/5] drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userqueue.c

2025-04-16 Thread Sunil Khatri
change the DRM_ERROR to drm_file_err to add process name and pid to the logging. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 52 +++ 1 file changed, 29 insertions(+), 23 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c b

[PATCH v4 5/5] drm/amdgpu: change DRM_DBG_DRIVER to drm_dbg_driver

2025-04-16 Thread Sunil Khatri
update the functions in amdgpu_userqueues.c from DRM_DBG_DRIVER to drm_dbg_driver so multi gpu instance can be logged in. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/a

[PATCH v4 3/5] drm/amdgpu: use drm_file_err to add process info

2025-04-16 Thread Sunil Khatri
add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. drm_file_err logs pid and process name by default. Sample log: [ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed out waiting for fence f=1c74d97

[PATCH v4 2/5] drm/amdgpu: add drm_file reference in userq_mgr

2025-04-16 Thread Sunil Khatri
drm_file will be used in usermode queues code to enable better process information in logging and hence add drm_file part of the userq_mgr struct. update the drm_file pointer in userq_mgr for each amdgpu_driver_open_kms. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c

[PATCH v4 1/5] drm: add macro drm_file_err to print process info

2025-04-16 Thread Sunil Khatri
Add a drm helper macro which appends the process information for the drm_file over drm_err. Signed-off-by: Sunil Khatri --- include/drm/drm_file.h | 41 + 1 file changed, 41 insertions(+) diff --git a/include/drm/drm_file.h b/include/drm/drm_file.h index 9

Re: [PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump

2025-04-16 Thread Christian König
On 16.04.25 at 14:54, Liang, Prike wrote: > [Public] > >> From: Koenig, Christian >> Sent: Wednesday, April 16, 2025 7:01 PM >> To: Liang, Prike ; amd-gfx@lists.freedesktop.org >> Cc: Deucher, Alexander >> Subject: Re: [PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump >> >> On 16.04

RE: [PATCH 2/4] drm/amdgpu: set the evf name to identify the userq case

2025-04-16 Thread Liang, Prike
[Public] > From: Koenig, Christian > Sent: Wednesday, April 16, 2025 7:02 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH 2/4] drm/amdgpu: set the evf name to identify the userq > case > > On 16.04.25 at 10:50, Prike Liang wrote: > > The evf

RE: [PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump

2025-04-16 Thread Liang, Prike
[Public] > From: Koenig, Christian > Sent: Wednesday, April 16, 2025 7:01 PM > To: Liang, Prike ; amd-gfx@lists.freedesktop.org > Cc: Deucher, Alexander > Subject: Re: [PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump > > On 16.04.25 at 10:50, Prike Liang wrote: > > This debug dump

Re: [PATCH 3/3] drm/amdgpu: Optimize DMABuf attachment with XGMI

2025-04-16 Thread Christian König
On 16.04.25 at 06:45, Felix Kuehling wrote: > When peer memory is accessed through XGMI, it does not need to be visible > in the BAR and there is no need for SG-tables or DMA mappings. > > Signed-off-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 16 +++- >

Re: [PATCH 2/3] drm/amdgpu: Allow P2P access through XGMI

2025-04-16 Thread Christian König
On 16.04.25 at 06:45, Felix Kuehling wrote: > If peer memory is accessible through XGMI, allow leaving it in VRAM > rather than forcing its migration to GTT on DMABuf attachment. > > Signed-off-by: Felix Kuehling > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 31 - > 1

Re: [PATCH 1/3] drm/amdgpu: Don't pin VRAM without DMABUF_MOVE_NOTIFY

2025-04-16 Thread Christian König
On 16.04.25 at 06:45, Felix Kuehling wrote: > Pinning of VRAM is for peer devices that don't support dynamic attachment > and move notifiers. But it requires that all such peer devices are able to > access VRAM via PCIe P2P. Any device without P2P access requires migration > to GTT, which fails if

Re: [PATCH RFC] drm/amdgpu: Block userspace mapping of IO

2025-04-16 Thread Christian König
On 16.04.25 at 13:43, Ujwal Kundur wrote: > Thanks for your response. > >> Hui what? Why do you think that grabbing a reference to an interrupt would >> block userspace mapping of IO registers? > It looks like I am missing a lot of pieces to do this, I'll try again > once I have a better understa

Re: [PATCH v3 3/4] drm/amdgpu: use drm_file_err in logging to also dump process information

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 5:37 PM, Pierre-Eric Pelloux-Prayer wrote: Hi, On 16/04/2025 at 12:01, Khatri, Sunil wrote: On 4/16/2025 12:56 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: add process and pid information in the userqueue error logging to make it more useful in resolvi

Re: [PATCH v3 3/4] drm/amdgpu: use drm_file_err in logging to also dump process information

2025-04-16 Thread Pierre-Eric Pelloux-Prayer
Hi, On 16/04/2025 at 12:01, Khatri, Sunil wrote: On 4/16/2025 12:56 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. Sample log: [   42.444297] [drm:amdg

Re: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-16 Thread Christian König
On 15.04.25 at 20:28, Alexey Klimov wrote: > #regzbot introduced: v6.12..v6.13 > > I use RX6600 on arm64 Orion o6 board and it seems that amdgpu is broken on > recent kernels, fails on boot: Well in general we already had tons of problems with low end ARM64 boards. So first question of all is t

Re: [PATCH v3 1/4] drm: add function drm_file_err to print proc information too

2025-04-16 Thread Christian König
On 16.04.25 at 10:39, Khatri, Sunil wrote: > > On 4/16/2025 12:37 PM, Tvrtko Ursulin wrote: >> >> On 15/04/2025 19:43, Sunil Khatri wrote: >>> [SNIP] >>> + >> >> I was hoping something primitive could be enough. With no temporary stack >> space required. Primitive on the level of (but simplified

Re: [PATCH 4/4] drm/amdgpu: free the evf when the attached bo release

2025-04-16 Thread Christian König
On 16.04.25 at 10:50, Prike Liang wrote: > Free the evf when the attached bo released. The evf still > be dependent on and referred to by the attached bo that is > scheduled by the kernel queue SDMA or gfx after the evf signalled. > > Signed-off-by: Prike Liang > --- > .../drm/amd/amdgpu/amdgpu_

Re: [PATCH 3/4] drm/amdgpu: trace the scheduler dependent job fence name

2025-04-16 Thread Christian König
On 16.04.25 at 10:50, Prike Liang wrote: > This trace will help in tracking the scheduler dependent > job fence. Changes for general DRM code need to go to the appropriate mailing list. Apart from that IIRC we intentionally didn't do that. Why should the driver name be relevant here? Regards,

Re: [PATCH 2/4] drm/amdgpu: set the evf name to identify the userq case

2025-04-16 Thread Christian König
On 16.04.25 at 10:50, Prike Liang wrote: > The evf fence name can clearly identify the userq usage. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu

Re: [PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump

2025-04-16 Thread Christian König
On 16.04.25 at 10:50, Prike Liang wrote: > This debug dump will help on debugging the evf attached gem obj fence > related issue. That looks like overkill to me and will just massively spam the debug log. Christian. > > Signed-off-by: Prike Liang > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_evic

Re: [PATCH 0/6] enable switching to new gpu index for hibernate on SRIOV.

2025-04-16 Thread Zhang, GuoQing (Sam)
[AMD Official Use Only - AMD Internal Distribution Only] Ping… Regards Sam From: Samuel Zhang Date: Monday, April 14, 2025 at 18:47 To: amd-gfx@lists.freedesktop.org Cc: Zhao, Victor , Chang, HaiJun , Deng, Emily , Zhang, GuoQing (Sam) Subject: [PATCH 0/6] enable switching to new gpu index f

Re: [PATCH v3 3/4] drm/amdgpu: use drm_file_err in logging to also dump process information

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 12:56 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. Sample log: [   42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Time

RE: [PATCH v2] drm/amdgpu: Disallow partition query during reset

2025-04-16 Thread Kamal, Asad
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Asad Kamal Thanks & Regards Asad -Original Message- From: Lazar, Lijo Sent: Wednesday, April 16, 2025 1:42 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Kamal, Asad Subject: [PATCH

Re: [PATCH v2 00/12] Generate CPER records for RAS and commit to CPER ring

2025-04-16 Thread Aravind Iddamsetty
Hi, I would appreciate it if you could kindly let me know your thoughts. Thanks, Aravind. On 28-03-2025 17:42, Aravind Iddamsetty wrote: > ++ dri-devel > > On 28-03-2025 15:57, Aravind Iddamsetty wrote: >> Hi, >> >> Based on the discussions around using Netlink for RAS purposes, as >> summarized

[PATCH 3/4] drm/amdgpu: trace the scheduler dependent job fence name

2025-04-16 Thread Prike Liang
This trace will help in tracking the scheduler dependent job fence. Signed-off-by: Prike Liang --- drivers/gpu/drm/scheduler/gpu_scheduler_trace.h | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler_trace.h b/drivers/gpu/drm/scheduler/

[PATCH 4/4] drm/amdgpu: free the evf when the attached bo release

2025-04-16 Thread Prike Liang
Free the evf when the attached bo released. The evf still be dependent on and referred to by the attached bo that is scheduled by the kernel queue SDMA or gfx after the evf signalled. Signed-off-by: Prike Liang --- .../drm/amd/amdgpu/amdgpu_eviction_fence.c| 31 --- .../drm/a

[PATCH 2/4] drm/amdgpu: set the evf name to identify the userq case

2025-04-16 Thread Prike Liang
The evf fence name can clearly identify the userq usage. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_e

[PATCH 1/4] drm/amdgpu: add the evf attached gem obj resv dump

2025-04-16 Thread Prike Liang
This debug dump will help on debugging the evf attached gem obj fence related issue. Signed-off-by: Prike Liang --- drivers/gpu/drm/amd/amdgpu/amdgpu_eviction_fence.c | 13 + drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 6 +- 2 files changed, 18 insertions(+), 1 deletion

RE: [PATCH v2] drm/amdgpu: Disallow partition query during reset

2025-04-16 Thread Zhang, Hawking
[AMD Official Use Only - AMD Internal Distribution Only] Reviewed-by: Hawking Zhang Regards, Hawking -Original Message- From: Lazar, Lijo Sent: Wednesday, April 16, 2025 16:12 To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Kamal, Asad Subject: [PATCH v2] d

Re: [PATCH v3 2/4] drm/amdgpu: add drm_file reference in userq_mgr

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 12:59 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: drm_file will be used in usermode queues code to enable better process information in logging and hence add drm_file part of the userq_mgr struct. update the drm_file pointer in userq_mgr for each amdgpu_dr

Re: [PATCH v3 1/4] drm: add function drm_file_err to print proc information too

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 12:37 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: Add a drm helper function which get the process information for the drm_file and append the process information using the existing drm_err. Signed-off-by: Sunil Khatri ---   include/drm/drm_file.h | 40 +++

RE: [PATCH] drm/amdgpu: Disallow partition query during reset

2025-04-16 Thread Lazar, Lijo
[AMD Official Use Only - AMD Internal Distribution Only] Sending a v2. Please ignore this. Thanks, Lijo -Original Message- From: amd-gfx On Behalf Of Lijo Lazar Sent: Wednesday, April 16, 2025 1:35 PM To: amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Deucher, Alexander ; Kamal, Asa

[PATCH v2] drm/amdgpu: Disallow partition query during reset

2025-04-16 Thread Lijo Lazar
Reject queries to get current partition modes during reset. Also, don't accept sysfs interface requests to switch compute partition mode while in reset. Signed-off-by: Lijo Lazar --- v2: Keep consistent error code, return EPERM drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 10 ++ drivers/gp

[PATCH] drm/amdgpu: Disallow partition query during reset

2025-04-16 Thread Lijo Lazar
Reject queries to get current partition modes during reset. Also, don't accept sysfs interface requests to switch compute partition mode while in reset. Signed-off-by: Lijo Lazar --- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c | 10 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c | 4 2 fil

Re: [PATCH RFC] drm/amdgpu: Block userspace mapping of IO

2025-04-16 Thread Christian König
On 16.04.25 at 09:28, Ujwal Kundur wrote: > This is a RFC patch for blocking userspace mapping of IO register(s) > before ioremap() calls are made. Out of the available IRQ sources, CRTC > seemed the most appropriate for this task, however I'm not quite sure > about that as well as the type, which

Re: [PATCH v3 2/4] drm/amdgpu: add drm_file reference in userq_mgr

2025-04-16 Thread Tvrtko Ursulin
On 15/04/2025 19:43, Sunil Khatri wrote: drm_file will be used in usermode queues code to enable better process information in logging and hence add drm_file part of the userq_mgr struct. update the drm_file pointer in userq_mgr for each amdgpu_driver_open_kms. Signed-off-by: Sunil Khatri --

Re: [PATCH v3 3/4] drm/amdgpu: use drm_file_err in logging to also dump process information

2025-04-16 Thread Tvrtko Ursulin
On 15/04/2025 19:43, Sunil Khatri wrote: add process and pid information in the userqueue error logging to make it more useful in resolving the error by logs. Sample log: [ 42.444297] [drm:amdgpu_userqueue_wait_for_signal [amdgpu]] *ERROR* Timed out waiting for fence f=1c74d978 for

[REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-16 Thread Alexey Klimov
#regzbot introduced: v6.12..v6.13 I use RX6600 on arm64 Orion o6 board and it seems that amdgpu is broken on recent kernels, fails on boot: [drm] amdgpu: 7886M of GTT memory ready. [drm] GART: num cpu pages 131072, num gpu pages 131072 SError Interrupt on CPU11, code 0xbe11 -- SErr

Reply: [REGRESSION] amdgpu: async system error exception from hdp_v5_0_flush_hdp()

2025-04-16 Thread Fugang Duan
From: Alexey Klimov Sent: April 16, 2025 2:28 >#regzbot introduced: v6.12..v6.13 > >I use RX6600 on arm64 Orion o6 board and it seems that amdgpu is broken on >recent >kernels, fails on boot: > >[drm] amdgpu: 7886M of GTT memory ready. >[drm] GART: num cpu pages 131072, num gpu pages 131072 SError Int

Re: [PATCH v3 21/54] dyndbg-test: change do_prints testpoint to accept a loopct

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:04 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > echo 1000 > /sys/module/test_dynamic_debug/parameters/do_prints > > > > This allows its use as a scriptable load generator, to generate > > dynamic-prefix-emits for flag combinations vs und

[PATCH 5.10.y] drm/amd/display: Stop amdgpu_dm initialize when link nums greater than max_links

2025-04-16 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit cf8b16857db702ceb8d52f9219a4613363e2b1cf ] [Why] Coverity reports an OVERRUN warning. There are only max_links elements within dc->links. The link count could be up to AMDGPU_DM_MAX_DISPLAY_INDEX 31. [How] Make sure the link count is less than max_links. Reviewed-by: Harry Went
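The overrun described in this commit message can be illustrated with a minimal user-space sketch. It is not the upstream fix: `struct demo_dc`, `DEMO_MAX_LINKS`, and `demo_init_links()` are hypothetical names, and the sketch assumes the safe behaviour is to refuse initialization when the requested link count does not fit the array.

```c
#include <stddef.h>

#define DEMO_MAX_LINKS 8 /* hypothetical capacity, stands in for max_links */

/* Hypothetical stand-in for a structure holding a fixed-size links array. */
struct demo_dc {
    int links[DEMO_MAX_LINKS];
};

/*
 * Initialize the first link_cnt entries. Returns 0 on success and -1 when
 * link_cnt would index past the end of dc->links, so no write is attempted.
 */
static int demo_init_links(struct demo_dc *dc, size_t link_cnt)
{
    size_t i;

    if (link_cnt > DEMO_MAX_LINKS)
        return -1; /* stop before any out-of-bounds access */

    for (i = 0; i < link_cnt; i++)
        dc->links[i] = (int)i;

    return 0;
}
```

Checking the count once, before the loop, keeps every later index provably inside the array.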

Re: [PATCH v3 23/54] dyndbg: treat comma as a token separator

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:05 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > Treat comma as a token terminator, just like a space. This allows a > > user to avoid quoting hassles when spaces are otherwise needed: > > > > :#> modprobe drm dyndbg=class,DRM_UT_CORE,
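The idea of treating a comma like a space when splitting a query can be sketched in plain C. This is an illustration only, not the dyndbg parser: `split_words()` is a hypothetical helper, and the sample input uses just the two tokens visible in the truncated command above.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: split buf in place on spaces and commas, storing up
 * to max token pointers in words[]. Returns the number of tokens found.
 */
static size_t split_words(char *buf, char *words[], size_t max)
{
    const char *seps = " ,"; /* comma is a separator, just like space */
    size_t n = 0;

    while (n < max) {
        buf += strspn(buf, seps);  /* skip leading separators */
        if (*buf == '\0')
            break;
        words[n++] = buf;
        buf += strcspn(buf, seps); /* advance to the end of the token */
        if (*buf == '\0')
            break;
        *buf++ = '\0';             /* terminate the token */
    }
    return n;
}
```

With this rule, `class,DRM_UT_CORE` and `class DRM_UT_CORE` yield the same two tokens, so the comma form needs no shell quoting.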

Re: [PATCH v3 17/54] dyndbg-API: replace DECLARE_DYNDBG_CLASSMAP

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:01 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > DECLARE_DYNDBG_CLASSMAP() has a design error; its usage fails a basic > > K&R rule: "define once, refer many times". > > > > When DRM_USE_DYNAMIC_DEBUG=y, it is used across DRM core & drive

[PATCH 5.15.y] drm/amd/display: Stop amdgpu_dm initialize when link nums greater than max_links

2025-04-16 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit cf8b16857db702ceb8d52f9219a4613363e2b1cf ] [Why] Coverity reports an OVERRUN warning. There are only max_links elements within dc->links. The link count could be up to AMDGPU_DM_MAX_DISPLAY_INDEX 31. [How] Make sure the link count is less than max_links. Reviewed-by: Harry Went

[PATCH 6.6.y] drm/amd/display: Stop amdgpu_dm initialize when link nums greater than max_links

2025-04-16 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit cf8b16857db702ceb8d52f9219a4613363e2b1cf ] [Why] Coverity reports an OVERRUN warning. There are only max_links elements within dc->links. The link count could be up to AMDGPU_DM_MAX_DISPLAY_INDEX 31. [How] Make sure the link count is less than max_links. Reviewed-by: Harry Went

Re: [PATCH v3 20/54] dyndbg: check DYNAMIC_DEBUG_CLASSMAP_DEFINE args at compile-time

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:04 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > Add __DYNAMIC_DEBUG_CLASSMAP_CHECK to implement the following > > arg-checks at compile-time: > > > > 0 <= _base < 63 > > class_names is not empty > > class_names[0] is a
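The first two checks listed above can be approximated with C11 `static_assert`. This is a sketch under assumptions, not the kernel's `__DYNAMIC_DEBUG_CLASSMAP_CHECK`: `CLASSMAP_CHECK`, `ARRAY_LEN`, and `demo_names` are hypothetical, and the third check is left out because its description is cut off in the snippet.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical macro: reject an out-of-range base or an empty name list at
 * compile time. The caller supplies the trailing semicolon.
 */
#define CLASSMAP_CHECK(base, names)                                      \
    static_assert((base) >= 0 && (base) < 63, "base must be in 0..62");  \
    static_assert(ARRAY_LEN(names) > 0, "class_names must not be empty")

/* Hypothetical class names used only for this illustration. */
static const char *const demo_names[] = { "DEMO_CORE", "DEMO_DRIVER" };

CLASSMAP_CHECK(0, demo_names); /* passes: base 0, two names */

static size_t demo_class_count(void)
{
    return ARRAY_LEN(demo_names);
}
```

Changing the base to 63 in the `CLASSMAP_CHECK` line makes the build fail with the first message, which is the point of doing the check at compile time.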

[PATCH 6.1.y] drm/amd/display: Stop amdgpu_dm initialize when link nums greater than max_links

2025-04-16 Thread jianqi.ren.cn
From: Hersen Wu [ Upstream commit cf8b16857db702ceb8d52f9219a4613363e2b1cf ] [Why] Coverity reports an OVERRUN warning. There are only max_links elements within dc->links. The link count could be up to AMDGPU_DM_MAX_DISPLAY_INDEX 31. [How] Make sure the link count is less than max_links. Reviewed-by: Harry Went

Re: [PATCH v3 26/54] dyndbg: change __dynamic_func_call_cls* macros into expressions

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:06 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > The Xe driver's XE_IOCTL_DBG macro calls drm_dbg() from inside an if > > (expression). This breaks when CONFIG_DRM_USE_DYNAMIC_DEBUG=y because > > the invoked macro has a do-while-0 wrappe
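The breakage described above comes from a general C rule: a `do { ... } while (0)` macro expands to a statement, so it cannot appear inside an `if (...)` condition. The sketch below contrasts that form with an expression form built on the GNU statement-expression extension (GCC and Clang). `LOG_STMT`, `LOG_EXPR`, `debug_hits`, and `check_arg()` are hypothetical, and the sketch does not claim to match what the patch itself does.

```c
#include <stdio.h>

static int debug_hits; /* counts how often the logging macro ran */

/* Statement form: fine on its own line, invalid inside an expression. */
#define LOG_STMT(msg) do { debug_hits++; fputs(msg, stderr); } while (0)

/* Expression form (GNU extension): evaluates to 1 after logging. */
#define LOG_EXPR(msg) ({ debug_hits++; fputs(msg, stderr); 1; })

static int check_arg(int bad)
{
    /* `bad && LOG_STMT(...)` would not compile; LOG_EXPR yields a value. */
    if (bad && LOG_EXPR("bad argument\n"))
        return -1;
    return 0;
}
```

Because `&&` short-circuits, the message is only emitted when `bad` is nonzero, which is the pattern a check-and-log macro relies on.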

Re: [PATCH v3 18/54] selftests-dyndbg: add tools/testing/selftests/dynamic_debug/*

2025-04-16 Thread jim . cromie
On Tue, Apr 15, 2025 at 4:02 AM Louis Chauvet wrote: > > > > On 02/04/2025 at 19:41, Jim Cromie wrote: > > Add a selftest script for dynamic-debug. The config requires > > CONFIG_TEST_DYNAMIC_DEBUG=m and CONFIG_TEST_DYNAMIC_DEBUG_SUBMOD=m, > > which tacitly requires either CONFIG_DYNAMIC_DEBUG=

Re: [PATCH v3 4/4] drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userqueue.c

2025-04-16 Thread Khatri, Sunil
On 4/16/2025 12:48 PM, Tvrtko Ursulin wrote: On 15/04/2025 19:43, Sunil Khatri wrote: change the DRM_ERROR to drm_file_err which gives the drm device information too which is useful in case of multiple GPU's and also add process information. Signed-off-by: Sunil Khatri ---   drivers/gpu/drm

Re: [PATCH v3 4/4] drm/amdgpu: change DRM_ERROR to drm_file_err in amdgpu_userqueue.c

2025-04-16 Thread Tvrtko Ursulin
On 15/04/2025 19:43, Sunil Khatri wrote: change the DRM_ERROR to drm_file_err which gives the drm device information too which is useful in case of multiple GPU's and also add process information. Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 59 +++

Re: [PATCH v3 1/4] drm: add function drm_file_err to print proc information too

2025-04-16 Thread Tvrtko Ursulin
On 15/04/2025 19:43, Sunil Khatri wrote: Add a drm helper function which get the process information for the drm_file and append the process information using the existing drm_err. Signed-off-by: Sunil Khatri --- include/drm/drm_file.h | 40 1 file