RE: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-04-14 Thread Shankar, Uma
> -Original Message- > From: Simon Ser > Sent: Tuesday, April 15, 2025 11:47 AM > To: Shankar, Uma > Cc: Alex Hung ; dri-de...@lists.freedesktop.org; amd- > g...@lists.freedesktop.org; intel-...@lists.freedesktop.org; wayland- > de...@lists.freedesktop.org; harry.wentl...@amd.com; leo..

RE: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-04-14 Thread Simon Ser
On Tuesday, April 15th, 2025 at 08:09, Shankar, Uma wrote: > We want to have just one change in the way we expose the hardware > capabilities else > all looks good in general. I would really recommend leaving this as a follow-up extension. It's a complicated addition that requires more discuss

RE: [PATCH V8 32/43] drm/colorop: Add 1D Curve Custom LUT type

2025-04-14 Thread Shankar, Uma
> -Original Message- > From: Alex Hung > Sent: Thursday, March 27, 2025 5:17 AM > To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org > Cc: wayland-de...@lists.freedesktop.org; harry.wentl...@amd.com; > alex.h...@amd.com; leo@amd.com; ville.syrj...@linux.intel.com; >

Re: [PATCH] drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb

2025-04-14 Thread Rodrigo Siqueira
On 04/14, Mario Limonciello wrote: > On 4/12/2025 3:37 PM, Rodrigo Siqueira wrote: > > Add some random documentation associated with the ring buffer > > manipulations and writeback. > > > > Signed-off-by: Rodrigo Siqueira > > --- > > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 29 +++

RE: [PATCH v2] drm/amdgpu: Clear overflow for SRIOV

2025-04-14 Thread Deng, Emily
[AMD Official Use Only - AMD Internal Distribution Only] Ping.. Emily Deng Best Wishes >-Original Message- >From: Emily Deng >Sent: Tuesday, April 15, 2025 9:33 AM >To: amd-gfx@lists.freedesktop.org >Cc: Deng, Emily >Subject: [PATCH v2] drm/amdgpu: Clear overflow for SRIOV > >For

[PATCH 2/2] drm/amd/pm: Enable host limit metrics support

2025-04-14 Thread Asad Kamal
Enable host limit metrics support for smuv_13_0_12 Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsm

[PATCH 1/2] drm/amd/pm: Enable host limit metrics support

2025-04-14 Thread Asad Kamal
Enable host limit metrics support for smuv_13_0_6 Signed-off-by: Asad Kamal Reviewed-by: Lijo Lazar --- drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c b/drivers/gpu/drm/amd/pm/swsm

[PATCH 2/2] drm/amdkfd: fix a bug of smi event for superuser

2025-04-14 Thread Eric Huang
rocm-smi with superuser permission doesn't show some SMI events, e.g. page fault/migration, because the condition "(events & all)" is false. A superuser should be able to detect all events, and the condition "(events & all)" seems redundant, so removing it fixes the issue. Signed-off-by: Eri

[PATCH 1/2] drm/amdgpu: Test for imported buffers with drm_gem_is_imported()

2025-04-14 Thread Thomas Zimmermann
Instead of testing import_attach for imported GEM buffers, invoke drm_gem_is_imported() to do the test. The helper tests the dma_buf itself while import_attach is just an artifact of the import. Prepares to make import_attach optional. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amd

[PATCH AUTOSEL 6.14 13/34] drm/amdkfd: sriov doesn't support per queue reset

2025-04-14 Thread Sasha Levin
From: Emily Deng [ Upstream commit ba6d8f878d6180d4d0ed0574479fc1e232928184 ] Disable per queue reset for sriov. Signed-off-by: Emily Deng Reviewed-by: Jonathan Kim Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++- 1 file changed,

[PATCH v2] drm/amdgpu: Clear overflow for SRIOV

2025-04-14 Thread Emily Deng
For VF, it doesn't have the permission to clear overflow, clear the bit by reset. Signed-off-by: Emily Deng --- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 15 +-- drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 + drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 6 +- drivers/gpu/drm/amd/amdg

Re: [PATCH v2 7/9] drm/amdgpu/gfx: Clean up gfx_v7_0_get_csb_buffer

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 3:38 PM Rodrigo Siqueira wrote: > > On 04/13, Alex Deucher wrote: > > On Sat, Apr 12, 2025 at 4:22 PM Rodrigo Siqueira > > wrote: > > > > > > CHIP_KAVERI, CHIP_KABINI, and CHIP_MULLINS have the same buffer > > > manipulation as the default option in the switch case. Remov

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 1:17 PM Khatri, Sunil wrote: > > > On 4/14/2025 8:59 PM, Alex Deucher wrote: > > On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote: > > This is how i see the future of this code and we can do based on it now > itself. > disable_kq = 0, Use kernel queues. > disable_kq =

[PATCH V3 2/4] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-14 Thread Alex Deucher
This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. v2: use idx v3: only stop compute/gfx queues Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h

[PATCH V3 4/4] drm/amdgpu/userq: integrate with enforce isolation

2025-04-14 Thread Alex Deucher
Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. v2: split out variable renaming, add config guards v3: use new

[PATCH 3/4] drm/amdgpu: rename enforce isolation variables

2025-04-14 Thread Alex Deucher
Since they will be used for both KFD and KGD user queues, rename them from kfd to userq. No intended functional change. Acked-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 32 +++

[PATCH 1/4] drm/amdgpu/userq: track the xcp_id associated with the queue

2025-04-14 Thread Alex Deucher
Track this to align with KFD for enforce isolation handling. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h b/drivers/gpu/drm/amd/amdgpu/am

Re: [PATCH 4/4] drm/amdgpu/userq: integrate with enforce isolation

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 1:58 PM Khatri, Sunil wrote: > > If i am not wrong @arvind is already having the patch to remove this > config. Should we use the function pointer check as being used in EOP > and SDMA functions ? The list will be empty if there are no user queues active. Although, think

Re: [PATCH] drm/amdgpu: Add documentation to some parts of the AMDGPU ring and wb

2025-04-14 Thread Mario Limonciello
On 4/12/2025 3:37 PM, Rodrigo Siqueira wrote: Add some random documentation associated with the ring buffer manipulations and writeback. Signed-off-by: Rodrigo Siqueira --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 29 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 37 ++

Re: [PATCH 01/13] drm/amd/display: make sure drm_edid stored in aconnector doesn't leak

2025-04-14 Thread Mario Limonciello
On 4/11/2025 3:08 PM, Melissa Wen wrote: Make sure the drm_edid container stored in aconnector is freed when detroying the aconnector. destroying Fixes: 48edb2a4 ("drm/amd/display: switch amdgpu_dm_connector to use struct drm_edid") Signed-off-by: Melissa Wen Minor nit above. Add to nex

[PATCH] drm/amdgpu/userq: rework driver parameter

2025-04-14 Thread Alex Deucher
Replace disable_kq parameter with user_queue parameter. The parameter has the following logic: -1 = auto (ASIC specific default) 0 = user queues disabled 1 = user queues enabled and kernel queues enabled (if supported) 2 = user queues enabled and kernel queues disabled The default behavior
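
The value mapping described in this patch can be sketched in plain C as follows. This is only an illustration of the listed logic, not the driver's code: the names `userq_mode`, `queue_policy`, and `resolve_user_queue`, the `asic_default` argument, and the fallback to "user queues disabled" for unknown values are all assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical names; the numeric values mirror the list in the patch text. */
enum userq_mode {
	USERQ_AUTO     = -1, /* ASIC specific default */
	USERQ_DISABLED =  0, /* user queues disabled */
	USERQ_WITH_KQ  =  1, /* user queues and kernel queues (if supported) */
	USERQ_ONLY     =  2, /* user queues enabled, kernel queues disabled */
};

struct queue_policy {
	bool user_queues;
	bool kernel_queues;
};

/*
 * asic_default is an assumed stand-in for whatever the ASIC would pick
 * when the parameter is -1; it is expected to be 0, 1 or 2.
 */
struct queue_policy resolve_user_queue(int user_queue, int asic_default)
{
	/* Start from "user queues disabled", i.e. kernel queues only. */
	struct queue_policy p = { false, true };
	int mode = (user_queue == USERQ_AUTO) ? asic_default : user_queue;

	switch (mode) {
	case USERQ_WITH_KQ:
		p.user_queues = true;
		p.kernel_queues = true;
		break;
	case USERQ_ONLY:
		p.user_queues = true;
		p.kernel_queues = false;
		break;
	case USERQ_DISABLED:
	default:
		/* Assumption: unknown values behave like 0. */
		break;
	}
	return p;
}
```

Returning a small struct keeps the two decisions (user queues on/off, kernel queues on/off) separate, which is the distinction the single disable_kq flag could not express.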

[PATCH 2/2] drm/amdgpu: Use dma_buf from GEM object instance

2025-04-14 Thread Thomas Zimmermann
Avoid dereferencing struct drm_gem_object.import_attach for the imported dma-buf. The dma_buf field in the GEM object instance refers to the same buffer. Prepares to make import_attach optional. Signed-off-by: Thomas Zimmermann --- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +- drivers/gpu/

Re: [PATCH v1 1/3] drm: function to get process name and pid

2025-04-14 Thread Christian König
Adding Pierre-eric and Tvrtko as well. Am 11.04.25 um 15:04 schrieb Sunil Khatri: > Add helper function which get the process information for > the drm_file and updates the user provided character buffer > with the information of process name and pid as a string. > > Signed-off-by: Sunil Khatri >

[PATCH 0/6] enable switching to new gpu index for hibernate on SRIOV.

2025-04-14 Thread Samuel Zhang
In SRIOV and VM environments, customers may need to switch to new vGPU indexes after hibernating and then resuming the VM. For GPUs with XGMI, `vram_start` will change in this case, and the VRAM aperture GPU addresses of VRAM BOs will also change. These GPU addresses need to be updated on resume. But these

Re: [PATCH 3/4] drm/amdgpu: rename enforce isolation variables

2025-04-14 Thread Khatri, Sunil
Acked-by: Sunil Khatri On 4/14/2025 10:42 PM, Alex Deucher wrote: Since they will be used for both KFD and KGD user queues, rename them from kfd to userq. No intended functional change. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/amd/

Re: [PATCH 4/4] drm/amdgpu/userq: integrate with enforce isolation

2025-04-14 Thread Khatri, Sunil
If i am not wrong @arvind  is already having the patch to remove this config. Should we use the function pointer check as being used in EOP and SDMA functions ? Regards Sunil Khatri On 4/14/2025 10:42 PM, Alex Deucher wrote: Enforce isolation serializes access to the GFX IP. User queues are

[PATCH 1/4] drm/amdgpu/userq: track the xcp_id associated with the queue

2025-04-14 Thread Alex Deucher
Track this to align with KFD for enforce isolation handling. Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h b/drivers/gpu/drm/amd/amdgpu/am

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Khatri, Sunil
Series is Reviewed-by: Sunil Khatri On 4/13/2025 9:36 PM, Alex Deucher wrote: Regardless of whether we disable kernel queues, we need to take an extra reference to the pipe interrupts for user queues to make sure they stay enabled in case we disable them for kernel queues. Signed-off-by: Alex

Re: [PATCH 4/4] drm/sdma7: properly reference trap interrupts for userqs

2025-04-14 Thread Khatri, Sunil
Reviewed-by: Sunil Khatri On 4/13/2025 9:36 PM, Alex Deucher wrote: We need to take a reference to the interrupts to make sure they stay enabled even if the kernel queues have disabled them. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 31

Re: [PATCH v2] drm/amdgpu: Add queue id support to the user queue wait IOCTL

2025-04-14 Thread Marek Olšák
On Mon, Apr 14, 2025 at 7:33 AM Christian König wrote: > Am 12.04.25 um 10:03 schrieb Arunpravin Paneer Selvam: > > Add queue id support to the user queue wait IOCTL > > drm_amdgpu_userq_wait structure. > > > > This is required to retrieve the wait user queue and maintain > > the fence driver ref

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Khatri, Sunil
On 4/14/2025 10:54 PM, Alex Deucher wrote: On Mon, Apr 14, 2025 at 1:17 PM Khatri, Sunil wrote: On 4/14/2025 8:59 PM, Alex Deucher wrote: On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote: This is how i see the future of this code and we can do based on it now itself. disable_kq = 0, Use k

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Khatri, Sunil
On 4/14/2025 8:59 PM, Alex Deucher wrote: On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote: This is how i see the future of this code and we can do based on it now itself. disable_kq = 0, Use kernel queues. disable_kq = 1, Use User queues. disable_kq = 0 means allow kernel queues and user q

[PATCH 3/4] drm/amdgpu: rename enforce isolation variables

2025-04-14 Thread Alex Deucher
Since they will be used for both KFD and KGD user queues, rename them from kfd to userq. No intended functional change. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 32 +++--- drivers/gpu/drm/amd

[PATCH 2/4] drm/amdgpu/userq: add helpers to start/stop scheduling

2025-04-14 Thread Alex Deucher
This will be used to stop/start user queue scheduling for example when switching between kernel and user queues when enforce isolation is enabled. v2: use idx Reviewed-by: Sunil Khatri Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amd

[PATCH 4/4] drm/amdgpu/userq: integrate with enforce isolation

2025-04-14 Thread Alex Deucher
Enforce isolation serializes access to the GFX IP. User queues are isolated in the MES scheduler, but we still need to serialize between kernel queues and user queues. For enforce isolation, group KGD user queues with KFD user queues. v2: split out variable renaming, add config guards Signed-off

Re: [PATCH v2 2/2] drm/amdgpu: Clean up error handling in amdgpu_userq_fence_driver_alloc()

2025-04-14 Thread Alex Deucher
Applied the series. Thanks! On Mon, Apr 14, 2025 at 12:48 AM Yadav, Arvind wrote: > > Reviewed-by:Arvind Yadav > > On 4/12/2025 8:09 PM, Dan Carpenter wrote: > > 1) Checkpatch complains if we print an error message for kzalloc() > > failure. The kzalloc() failure already has it's own error

[PATCH AUTOSEL 6.14 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Sasha Levin
From: Christian König [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] Try pinning into VRAM to allow P2P with RDMA NICs without ODP support if all attachments can do P2P. If any attachment can't do P2P just pin into GTT instead. Acked-by: Simona Vetter Signed-off-by: Christian Kön

[PATCH 1/2] drm/amdkfd: fix NULL check mistake for process smi event

2025-04-14 Thread Eric Huang
The mistake will lead to NULL kernel oops, so fix it. Fixes: 56ed4241e9fe ("drm/amdkfd: add smi events for process start and end") Signed-off-by: Eric Huang --- drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/am

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote: > > This is how i see the future of this code and we can do based on it now > itself. > disable_kq = 0, Use kernel queues. > disable_kq = 1, Use User queues. disable_kq = 0 means allow kernel queues and user queues. disable_kq =1 means disabl

RE: [PATCH 2/2] drm/amdkfd: fix a bug of smi event for superuser

2025-04-14 Thread Russell, Kent
[Public] Series is Reviewed-by: Kent Russell > -Original Message- > From: amd-gfx On Behalf Of Eric Huang > Sent: Monday, April 14, 2025 11:33 AM > To: amd-gfx@lists.freedesktop.org > Cc: Huang, JinHuiEric > Subject: [PATCH 2/2] drm/amdkfd: fix a bug of smi event for superuser > > r

Re: [PATCH 3/4] drm/sdma6: properly reference trap interrupts for userqs

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 5:59 AM Khatri, Sunil wrote: > > Same explanation as patch 1 of the series here too. Do we want to depend > on the disable_kq flag solely to enable/disable sdma trap. > IIUC, we dont want to do it in case of kernel queues at all and only > needed when using userqueue and th

Re: [PATCH] drm/amdgpu: add missing DCE6 to dce_version_to_string()

2025-04-14 Thread Alex Deucher
Applied. Thanks! On Sun, Apr 13, 2025 at 4:51 PM Alexandre Demers wrote: > > Missing DCE 6.0 6.1 and 6.4 are identified as UNKNOWN. Fix this. > > Signed-off-by: Alexandre Demers > --- > drivers/gpu/drm/amd/display/dc/dc_helper.c | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/dr

Re: [PATCH 6/6] drm/amdgpu: fix typo in bios_parser.c

2025-04-14 Thread Alex Deucher
Applied the series. Thanks. Alex On Mon, Apr 7, 2025 at 10:23 PM Alexandre Demers wrote: > > Probably a cut and paste error from using get_integrated_info_v8's comment. > This has to be get_integrated_info_v9 > > Signed-off-by: Alexandre Demers > --- > drivers/gpu/drm/amd/display/dc/bios/bios

RE: [PATCH] drm/amdgpu/userq: move runpm handling into core userq code

2025-04-14 Thread Liu, Shaoyun
[AMD Official Use Only - AMD Internal Distribution Only] Looks good to me . Reviewed-by: Shaoyun.liu -Original Message- From: amd-gfx On Behalf Of Alex Deucher Sent: Sunday, April 13, 2025 2:24 PM To: amd-gfx@lists.freedesktop.org Cc: Deucher, Alexander Subject: [PATCH] drm/amdgpu/user

[PATCH 09/11] drm/radeon: Test for imported buffers with drm_gem_is_imported()

2025-04-14 Thread Thomas Zimmermann
Instead of testing import_attach for imported GEM buffers, invoke drm_gem_is_imported() to do the test. The helper tests the dma_buf itself while import_attach is just an artifact of the import. Prepares to make import_attach optional. Signed-off-by: Thomas Zimmermann Cc: Alex Deucher Cc: "Chris

Re: [Linaro-mm-sig] [PATCH AUTOSEL 6.13 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 9:28 AM Sasha Levin wrote: > > From: Christian König > > [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] > > Try pinning into VRAM to allow P2P with RDMA NICs without ODP > support if all attachments can do P2P. If any attachment can't do > P2P just pin into G

[PATCH AUTOSEL 6.13 14/34] drm/amdgpu: Increase KIQ invalidate_tlbs timeout

2025-04-14 Thread Sasha Levin
From: Jay Cornwall [ Upstream commit 3666ed821832f42baaf25f362680dda603cde732 ] KIQ invalidate_tlbs request has been seen to marginally exceed the configured 100 ms timeout on systems under load. All other KIQ requests in the driver use a 10 second timeout. Use a similar timeout implementation

Re: [PATCH AUTOSEL 6.12 11/30] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 9:29 AM Sasha Levin wrote: > > From: Christian König > > [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] > > Try pinning into VRAM to allow P2P with RDMA NICs without ODP > support if all attachments can do P2P. If any attachment can't do > P2P just pin into G

Re: [PATCH AUTOSEL 6.14 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 9:26 AM Sasha Levin wrote: > > From: Christian König > > [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] > > Try pinning into VRAM to allow P2P with RDMA NICs without ODP > support if all attachments can do P2P. If any attachment can't do > P2P just pin into G

Re: [PATCH RESEND] drm/amd/pm/powerplay/smumgr/fiji_smumgr: Fix wrong return value of fiji_populate_smc_boot_level()

2025-04-14 Thread Alex Deucher
On Mon, Apr 14, 2025 at 3:24 AM Wentao Liang wrote: > > The return value of fiji_populate_smc_boot_level() is always 0, which > represents the failure of the function. The result of phm_find_boot_level() > should be recorded and returned. Error handling is also needed for > phm_find_boot_level() to

[PATCH AUTOSEL 6.12 11/30] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Sasha Levin
From: Christian König [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] Try pinning into VRAM to allow P2P with RDMA NICs without ODP support if all attachments can do P2P. If any attachment can't do P2P just pin into GTT instead. Acked-by: Simona Vetter Signed-off-by: Christian Kön

[PATCH AUTOSEL 6.12 10/30] drm/amdgpu: Increase KIQ invalidate_tlbs timeout

2025-04-14 Thread Sasha Levin
From: Jay Cornwall [ Upstream commit 3666ed821832f42baaf25f362680dda603cde732 ] KIQ invalidate_tlbs request has been seen to marginally exceed the configured 100 ms timeout on systems under load. All other KIQ requests in the driver use a 10 second timeout. Use a similar timeout implementation

[PATCH AUTOSEL 6.13 15/34] drm/amdgpu: allow pinning DMA-bufs into VRAM if all importers can do P2P

2025-04-14 Thread Sasha Levin
From: Christian König [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ] Try pinning into VRAM to allow P2P with RDMA NICs without ODP support if all attachments can do P2P. If any attachment can't do P2P just pin into GTT instead. Acked-by: Simona Vetter Signed-off-by: Christian Kön

[PATCH 0/2] drm/amdgpu: Avoid struct drm_gem_object.import_attach

2025-04-14 Thread Thomas Zimmermann
Avoid the use of struct drm_gem_object.import_attach to get the object's dma-buf or test for an imported buffer. The import_attach field in struct drm_gem_object is an artifact of the import process, but should not be used otherwise. The helper drm_gem_is_imported() tests if a GEM object's buffer

[PATCH AUTOSEL 6.13 13/34] drm/amdkfd: sriov doesn't support per queue reset

2025-04-14 Thread Sasha Levin
From: Emily Deng [ Upstream commit ba6d8f878d6180d4d0ed0574479fc1e232928184 ] Disable per queue reset for sriov. Signed-off-by: Emily Deng Reviewed-by: Jonathan Kim Signed-off-by: Alex Deucher Signed-off-by: Sasha Levin --- drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++- 1 file changed,

[PATCH 2/6] drm/amdgpu: update cached GPU addresses for PSP and ucode

2025-04-14 Thread Samuel Zhang
Two reasons for this change: 1. When pdb0 is enabled, the GPU address from amdgpu_bo_create_kernel() is based on pdb0 and is not compatible with PSP and ucode, so it needs to be updated to use the original GPU address. 2. Since the original address will change after switching to a new GPU index after hibernation, it needs to

Re: [PATCH v2] drm/amdgpu: Add queue id support to the user queue wait IOCTL

2025-04-14 Thread Christian König
Am 12.04.25 um 10:03 schrieb Arunpravin Paneer Selvam: > Add queue id support to the user queue wait IOCTL > drm_amdgpu_userq_wait structure. > > This is required to retrieve the wait user queue and maintain > the fence driver references in it so that the user queue in > the same context releases t

[PATCH 4/6] drm/amdgpu: enable pdb0 for hibernation on SRIOV

2025-04-14 Thread Samuel Zhang
When switching to a new GPU index after hibernation and resume, the VRAM offset of each VRAM BO changes, and the cached GPU addresses need to be updated. This patch enables pdb0 and switches to pdb0-based virtual GPU addresses by default in amdgpu_bo_create_reserved(). Since the virtual addre

[PATCH 6/6] drm/amdgpu: fix fence fallback timer expired error

2025-04-14 Thread Samuel Zhang
IH does not work after switching to a new GPU index for the first time. The IH handler function needs to be re-registered with the kernel after switching to a new GPU index. Signed-off-by: Samuel Zhang Change-Id: Idece1c8fce24032fd08f5a8b6ac23793c51e56dd --- drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 7 +

[PATCH 5/6] drm/amdgpu: fix sdma ring test fail when resume from hibernation

2025-04-14 Thread Samuel Zhang
The GART TLB may be stale after switching to a new GPU index, which causes the GPU to fetch wrong data from GTT memory. Flush the GART TLB at the end of GMC resume to fix it. Signed-off-by: Samuel Zhang Change-Id: If2a3780319f5ecf3dcb0f1c07f85151ed65f522d --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +- 1 file ch

[PATCH 3/6] drm/amdgpu: update cached GPU addresses for SMU

2025-04-14 Thread Samuel Zhang
Two reasons for this change: 1. When pdb0 is enabled, the GPU address from amdgpu_bo_create_kernel() is based on pdb0 and is not compatible with SMU, so it needs to be updated to use the original GPU address. 2. Since the original GPU address will change after switching to a new GPU index after hibernation, it needs to be up

[PATCH 1/6] drm/amdgpu: update XGMI physical node id and GMC configs on resume

2025-04-14 Thread Samuel Zhang
For a virtual machine with vGPUs in SRIOV single device mode and XGMI enabled, XGMI physical node ids may change when waking up from hibernation with different vGPU devices. So update XGMI physical node ids on resume. Update GPU memory controller configuration on resume if XGMI physical node ids a

Re: [PATCH 13/13] drm/amd/display: move dc_sink from dc_edid to drm_edid

2025-04-14 Thread Jani Nikula
On Fri, 11 Apr 2025, Melissa Wen wrote: > +void dc_edid_copy_edid_to_sink(struct dc_sink *sink) > +{ > + const struct edid *edid; > + uint32_t edid_length; > + > + edid = drm_edid_raw(sink->drm_edid); // FIXME: Get rid of drm_edid_raw() > + edid_length = EDID_LENGTH * (edid->extens

Re: [PATCH 03/13] drm/amd/display: parse display name from drm_eld

2025-04-14 Thread Jani Nikula
On Fri, 11 Apr 2025, Melissa Wen wrote: > We don't need to parse dc_edid to get the display name since it's > already set in drm_eld which in turn had it values updated when updating > connector with the opaque drm_edid. > > Signed-off-by: Melissa Wen > --- > .../gpu/drm/amd/display/amdgpu_dm/am

Re: [PATCH 3/4] drm/sdma6: properly reference trap interrupts for userqs

2025-04-14 Thread Khatri, Sunil
Same explanation as patch 1 of the series here too. Do we want to depend on the disable_kq flag solely to enable/disable sdma trap. IIUC, we dont want to do it in case of kernel queues at all and only needed when using userqueue and that is taken care by using the flag disable_kq. Regards Suni

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Khatri, Sunil
I feel existing implementation makes more sense. Using the disable_kq flag to avoid get/put completely. Get/put for userqueues only here and for kernel anyways we are handling in ring_init. if (adev->userq_funcs[AMDGPU_HW_IP_GFX]), this if condition will always be valid once we enable it based

Re: [PATCH 1/4] drm/amdgpu/gfx11: properly reference EOP interrupts for userqs

2025-04-14 Thread Khatri, Sunil
This is how i see the future of this code and we can do based on it now itself. disable_kq = 0, Use kernel queues. disable_kq = 1, Use User queues. In case of kernel queues we should not be even calling gfx_v11_0_set_userq_eop_interrupts at all. Instead its better if we add a this check "if (a

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-14 Thread Christian König
Am 13.04.25 um 13:31 schrieb Fedor Pchelkin: > On Thu, 10. Apr 11:07, Christian König wrote: >> Am 09.04.25 um 19:27 schrieb Linus Torvalds: >>> The VM layer allows larger allocations. But the "this is a simple >>> allocation, choose kmalloc or vmalloc automatically based on size" >>> helper says "

Re: [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-14 Thread Christian König
Coming back to the original patch. Am 08.04.25 um 11:17 schrieb Denis Arefev: > The user can set any value to the variable ‘bo_number’, via the ioctl > command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic > expression ‘in->bo_number * in->bo_info_size’, which is prone to > overflow. A
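
The overflow concern raised in the quoted patch (a user-controlled count multiplied by an element size) can be illustrated with a small standalone C sketch. This is not the driver's code or the proposed fix: `checked_array_bytes` is a hypothetical helper, and the use of `size_t` for both operands is an assumption made for the example. In kernel code one would more likely rely on the existing overflow helpers in `include/linux/overflow.h`, such as `check_mul_overflow()`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute count * elem_size into *bytes, or return
 * false (leaving *bytes untouched) if the product would not fit in size_t.
 */
bool checked_array_bytes(size_t count, size_t elem_size, size_t *bytes)
{
	/* count * elem_size overflows exactly when count > SIZE_MAX / elem_size. */
	if (elem_size != 0 && count > SIZE_MAX / elem_size)
		return false;

	*bytes = count * elem_size;
	return true;
}
```

A caller would reject the request when the helper returns false, instead of passing a wrapped-around size to the allocator.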

Re: [PATCH] drm/amdgpu/userq: move runpm handling into core userq code

2025-04-14 Thread Khatri, Sunil
On 4/13/2025 11:54 PM, Alex Deucher wrote: Pull it out of the MES code and into the generic code. It's not MES specific and needs to be applied to all user queues regardless of the backend. Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 13 + driv

[PATCH v3] drm/amd/display: Add error check for avi and vendor infoframe setup function

2025-04-14 Thread Wentao Liang
The function fill_stream_properties_from_drm_display_mode() calls the function drm_hdmi_avi_infoframe_from_display_mode() and the function drm_hdmi_vendor_infoframe_from_display_mode(), but does not check its return value. Log the error messages to prevent silent failure if either function fails.

Re: [lvc-project] [PATCH] drm/amdgpu: check a user-provided number of BOs in list

2025-04-14 Thread Fedor Pchelkin
On Thu, 10. Apr 11:07, Christian König wrote: > Am 09.04.25 um 19:27 schrieb Linus Torvalds: > > The VM layer allows larger allocations. But the "this is a simple > > allocation, choose kmalloc or vmalloc automatically based on size" > > helper says "you are being simple, I'm going to check your ar

[PATCH RESEND] drm/amd/pm/powerplay/smumgr/fiji_smumgr: Fix wrong return value of fiji_populate_smc_boot_level()

2025-04-14 Thread Wentao Liang
The return value of fiji_populate_smc_boot_level() is always 0, which represents the failure of the function. The result of phm_find_boot_level() should be recorded and returned. Error handling is also needed for phm_find_boot_level() to reset the boot level when the function fails. A proper implemen