> -----Original Message-----
> From: Simon Ser
> Sent: Tuesday, April 15, 2025 11:47 AM
> To: Shankar, Uma
> Cc: Alex Hung ; dri-de...@lists.freedesktop.org; amd-
> g...@lists.freedesktop.org; intel-...@lists.freedesktop.org; wayland-
> de...@lists.freedesktop.org; harry.wentl...@amd.com; leo..
On Tuesday, April 15th, 2025 at 08:09, Shankar, Uma
wrote:
> We want just one change, in the way we expose the hardware
> capabilities; otherwise all looks good in general.
I would really recommend leaving this as a follow-up extension. It's a
complicated addition that requires more discuss
> -----Original Message-----
> From: Alex Hung
> Sent: Thursday, March 27, 2025 5:17 AM
> To: dri-de...@lists.freedesktop.org; amd-gfx@lists.freedesktop.org
> Cc: wayland-de...@lists.freedesktop.org; harry.wentl...@amd.com;
> alex.h...@amd.com; leo@amd.com; ville.syrj...@linux.intel.com;
>
On 04/14, Mario Limonciello wrote:
> On 4/12/2025 3:37 PM, Rodrigo Siqueira wrote:
> > Add some random documentation associated with the ring buffer
> > manipulations and writeback.
> >
> > Signed-off-by: Rodrigo Siqueira
> > ---
> > drivers/gpu/drm/amd/amdgpu/amdgpu.h | 29 +++
[AMD Official Use Only - AMD Internal Distribution Only]
Ping..
Emily Deng
Best Wishes
>-----Original Message-----
>From: Emily Deng
>Sent: Tuesday, April 15, 2025 9:33 AM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deng, Emily
>Subject: [PATCH v2] drm/amdgpu: Clear overflow for SRIOV
>
>For
Enable host limit metrics support for smuv_13_0_12
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
b/drivers/gpu/drm/amd/pm/swsm
Enable host limit metrics support for smuv_13_0_6
Signed-off-by: Asad Kamal
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu13/smu_v13_0_6_ppt.c
b/drivers/gpu/drm/amd/pm/swsm
rocm-smi with superuser permission doesn't show some
SMI events, e.g. page fault/migration, because the
condition "(events & all)" is false. A superuser
should be able to detect all events, and the condition
"(events & all)" seems redundant, so removing it
fixes the issue.
Signed-off-by: Eri
Instead of testing import_attach for imported GEM buffers, invoke
drm_gem_is_imported() to do the test. The helper tests the dma_buf
itself while import_attach is just an artifact of the import. Prepares
to make import_attach optional.
Signed-off-by: Thomas Zimmermann
---
drivers/gpu/drm/amd/amd
From: Emily Deng
[ Upstream commit ba6d8f878d6180d4d0ed0574479fc1e232928184 ]
Disable per queue reset for sriov.
Signed-off-by: Emily Deng
Reviewed-by: Jonathan Kim
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
1 file changed,
A VF doesn't have permission to clear the overflow bit, so clear the bit
by reset.
Signed-off-by: Emily Deng
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ih.c | 15 +--
drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h | 1 +
drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 6 +-
drivers/gpu/drm/amd/amdg
On Mon, Apr 14, 2025 at 3:38 PM Rodrigo Siqueira wrote:
>
> On 04/13, Alex Deucher wrote:
> > On Sat, Apr 12, 2025 at 4:22 PM Rodrigo Siqueira
> > wrote:
> > >
> > > CHIP_KAVERI, CHIP_KABINI, and CHIP_MULLINS have the same buffer
> > > manipulation as the default option in the switch case. Remov
On Mon, Apr 14, 2025 at 1:17 PM Khatri, Sunil wrote:
>
>
> On 4/14/2025 8:59 PM, Alex Deucher wrote:
>
> On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote:
>
> This is how i see the future of this code and we can do based on it now
> itself.
> disable_kq = 0, Use kernel queues.
> disable_kq =
This will be used to stop/start user queue scheduling for
example when switching between kernel and user queues when
enforce isolation is enabled.
v2: use idx
v3: only stop compute/gfx queues
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h
Enforce isolation serializes access to the GFX IP. User
queues are isolated in the MES scheduler, but we still
need to serialize between kernel queues and user queues.
For enforce isolation, group KGD user queues with KFD user
queues.
v2: split out variable renaming, add config guards
v3: use new
Since they will be used for both KFD and KGD user queues,
rename them from kfd to userq. No intended functional
change.
Acked-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 32 +++
Track this to align with KFD for enforce isolation
handling.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h
b/drivers/gpu/drm/amd/amdgpu/am
On Mon, Apr 14, 2025 at 1:58 PM Khatri, Sunil wrote:
>
> If i am not wrong @arvind is already having the patch to remove this
> config. Should we use the function pointer check as being used in EOP
> and SDMA functions ?
The list will be empty if there are no user queues active. Although,
think
On 4/12/2025 3:37 PM, Rodrigo Siqueira wrote:
Add some random documentation associated with the ring buffer
manipulations and writeback.
Signed-off-by: Rodrigo Siqueira
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 29 ++-
drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 37 ++
On 4/11/2025 3:08 PM, Melissa Wen wrote:
Make sure the drm_edid container stored in aconnector is freed when
detroying the aconnector.
destroying
Fixes: 48edb2a4 ("drm/amd/display: switch amdgpu_dm_connector to use struct
drm_edid")
Signed-off-by: Melissa Wen
Minor nit above. Add to nex
Replace disable_kq parameter with user_queue parameter.
The parameter has the following logic:
-1 = auto (ASIC specific default)
0 = user queues disabled
1 = user queues enabled and kernel queues enabled (if supported)
2 = user queues enabled and kernel queues disabled
The default behavior
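The four values listed above can be sketched as a small decoder. This is only an illustration of the described logic, not the driver's code: the enum, struct, and function names are hypothetical, the ASIC default is passed in because the description does not say what "auto" resolves to, and treating kernel queues as enabled when user queues are disabled is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the numeric values come from the description. */
enum user_queue_mode {
	USER_QUEUE_AUTO = -1,        /* auto (ASIC specific default) */
	USER_QUEUE_DISABLED = 0,     /* user queues disabled */
	USER_QUEUE_AND_KERNEL = 1,   /* user queues and kernel queues (if supported) */
	USER_QUEUE_ONLY = 2,         /* user queues enabled, kernel queues disabled */
};

struct queue_policy {
	bool user_queues;
	bool kernel_queues;
};

/*
 * Resolve the parameter into a policy.
 * asic_default: value that -1 resolves to (assumed to be 0, 1 or 2).
 * kq_supported: whether kernel queues are supported; only matters for mode 1.
 * Returns false if the value is outside the documented range.
 */
static bool resolve_user_queue(int param, int asic_default, bool kq_supported,
			       struct queue_policy *out)
{
	if (param == USER_QUEUE_AUTO)
		param = asic_default;

	switch (param) {
	case USER_QUEUE_DISABLED:
		out->user_queues = false;
		out->kernel_queues = true;	/* assumption: kernel queues stay on */
		return true;
	case USER_QUEUE_AND_KERNEL:
		out->user_queues = true;
		out->kernel_queues = kq_supported;
		return true;
	case USER_QUEUE_ONLY:
		out->user_queues = true;
		out->kernel_queues = false;
		return true;
	default:
		return false;
	}
}
```

Keeping the "auto" resolution in one place means the rest of the code only ever sees the three concrete modes.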
Avoid dereferencing struct drm_gem_object.import_attach for the
imported dma-buf. The dma_buf field in the GEM object instance refers
to the same buffer. Prepares to make import_attach optional.
Signed-off-by: Thomas Zimmermann
---
drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 2 +-
drivers/gpu/
Adding Pierre-eric and Tvrtko as well.
Am 11.04.25 um 15:04 schrieb Sunil Khatri:
> Add a helper function which gets the process information for
> the drm_file and updates the user provided character buffer
> with the process name and pid as a string.
>
> Signed-off-by: Sunil Khatri
>
In SRIOV and VM environments, a customer may need to switch to new vGPU indexes
after hibernating and then resuming the VM. For GPUs with XGMI, `vram_start`
changes in this case, and so do the VRAM aperture GPU addresses of VRAM BOs.
These GPU addresses need to be updated on resume. But these
Acked-by: Sunil Khatri
On 4/14/2025 10:42 PM, Alex Deucher wrote:
Since they will be used for both KFD and KGD user queues,
rename them from kfd to userq. No intended functional
change.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/amd/
If i am not wrong @arvind is already having the patch to remove this
config. Should we use the function pointer check as being used in EOP
and SDMA functions ?
Regards
Sunil Khatri
On 4/14/2025 10:42 PM, Alex Deucher wrote:
Enforce isolation serializes access to the GFX IP. User
queues are
Track this to align with KFD for enforce isolation
handling.
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.h
b/drivers/gpu/drm/amd/amdgpu/am
Series is Reviewed-by: Sunil Khatri
On 4/13/2025 9:36 PM, Alex Deucher wrote:
Regardless of whether we disable kernel queues, we need
to take an extra reference to the pipe interrupts for
user queues to make sure they stay enabled in case we
disable them for kernel queues.
Signed-off-by: Alex
Reviewed-by: Sunil Khatri
On 4/13/2025 9:36 PM, Alex Deucher wrote:
We need to take a reference to the interrupts to make
sure they stay enabled even if the kernel queues have
disabled them.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 31
On Mon, Apr 14, 2025 at 7:33 AM Christian König
wrote:
> Am 12.04.25 um 10:03 schrieb Arunpravin Paneer Selvam:
> > Add queue id support to the user queue wait IOCTL
> > drm_amdgpu_userq_wait structure.
> >
> > This is required to retrieve the wait user queue and maintain
> > the fence driver ref
On 4/14/2025 10:54 PM, Alex Deucher wrote:
On Mon, Apr 14, 2025 at 1:17 PM Khatri, Sunil wrote:
On 4/14/2025 8:59 PM, Alex Deucher wrote:
On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote:
This is how i see the future of this code and we can do based on it now itself.
disable_kq = 0, Use k
On 4/14/2025 8:59 PM, Alex Deucher wrote:
On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote:
This is how i see the future of this code and we can do based on it now itself.
disable_kq = 0, Use kernel queues.
disable_kq = 1, Use User queues.
disable_kq = 0 means allow kernel queues and user q
Since they will be used for both KFD and KGD user queues,
rename them from kfd to userq. No intended functional
change.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c| 32 +++---
drivers/gpu/drm/amd
This will be used to stop/start user queue scheduling for
example when switching between kernel and user queues when
enforce isolation is enabled.
v2: use idx
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 +
drivers/gpu/drm/amd/amd
Enforce isolation serializes access to the GFX IP. User
queues are isolated in the MES scheduler, but we still
need to serialize between kernel queues and user queues.
For enforce isolation, group KGD user queues with KFD user
queues.
v2: split out variable renaming, add config guards
Signed-off
Applied the series. Thanks!
On Mon, Apr 14, 2025 at 12:48 AM Yadav, Arvind wrote:
>
> Reviewed-by: Arvind Yadav
>
> On 4/12/2025 8:09 PM, Dan Carpenter wrote:
> > 1) Checkpatch complains if we print an error message for kzalloc()
> > failure. The kzalloc() failure already has its own error
From: Christian König
[ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter
Signed-off-by: Christian Kön
The mistake will lead to NULL kernel oops, so fix it.
Fixes: 56ed4241e9fe ("drm/amdkfd: add smi events for process start and end")
Signed-off-by: Eric Huang
---
drivers/gpu/drm/amd/amdkfd/kfd_smi_events.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/am
On Mon, Apr 14, 2025 at 5:44 AM Khatri, Sunil wrote:
>
> This is how i see the future of this code and we can do based on it now
> itself.
> disable_kq = 0, Use kernel queues.
> disable_kq = 1, Use User queues.
disable_kq = 0 means allow kernel queues and user queues. disable_kq
=1 means disabl
[Public]
Series is
Reviewed-by: Kent Russell
> -----Original Message-----
> From: amd-gfx On Behalf Of Eric Huang
> Sent: Monday, April 14, 2025 11:33 AM
> To: amd-gfx@lists.freedesktop.org
> Cc: Huang, JinHuiEric
> Subject: [PATCH 2/2] drm/amdkfd: fix a bug of smi event for superuser
>
> r
On Mon, Apr 14, 2025 at 5:59 AM Khatri, Sunil wrote:
>
> Same explanation as patch 1 of the series here too. Do we want to depend
> on the disable_kq flag solely to enable/disable sdma trap.
> IIUC, we dont want to do it in case of kernel queues at all and only
> needed when using userqueue and th
Applied. Thanks!
On Sun, Apr 13, 2025 at 4:51 PM Alexandre Demers
wrote:
>
> DCE 6.0, 6.1 and 6.4 are missing and are identified as UNKNOWN. Fix this.
>
> Signed-off-by: Alexandre Demers
> ---
> drivers/gpu/drm/amd/display/dc/dc_helper.c | 6 ++
> 1 file changed, 6 insertions(+)
>
> diff --git a/dr
Applied the series. Thanks.
Alex
On Mon, Apr 7, 2025 at 10:23 PM Alexandre Demers
wrote:
>
> Probably a cut and paste error from using get_integrated_info_v8's comment.
> This has to be get_integrated_info_v9
>
> Signed-off-by: Alexandre Demers
> ---
> drivers/gpu/drm/amd/display/dc/bios/bios
[AMD Official Use Only - AMD Internal Distribution Only]
Looks good to me.
Reviewed-by: Shaoyun.liu
-----Original Message-----
From: amd-gfx On Behalf Of Alex Deucher
Sent: Sunday, April 13, 2025 2:24 PM
To: amd-gfx@lists.freedesktop.org
Cc: Deucher, Alexander
Subject: [PATCH] drm/amdgpu/user
Instead of testing import_attach for imported GEM buffers, invoke
drm_gem_is_imported() to do the test. The helper tests the dma_buf
itself while import_attach is just an artifact of the import. Prepares
to make import_attach optional.
Signed-off-by: Thomas Zimmermann
Cc: Alex Deucher
Cc: "Chris
On Mon, Apr 14, 2025 at 9:28 AM Sasha Levin wrote:
>
> From: Christian König
>
> [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
>
> Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> support if all attachments can do P2P. If any attachment can't do
> P2P just pin into G
From: Jay Cornwall
[ Upstream commit 3666ed821832f42baaf25f362680dda603cde732 ]
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation
On Mon, Apr 14, 2025 at 9:29 AM Sasha Levin wrote:
>
> From: Christian König
>
> [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
>
> Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> support if all attachments can do P2P. If any attachment can't do
> P2P just pin into G
On Mon, Apr 14, 2025 at 9:26 AM Sasha Levin wrote:
>
> From: Christian König
>
> [ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
>
> Try pinning into VRAM to allow P2P with RDMA NICs without ODP
> support if all attachments can do P2P. If any attachment can't do
> P2P just pin into G
On Mon, Apr 14, 2025 at 3:24 AM Wentao Liang wrote:
>
> The return value of fiji_populate_smc_boot_level() is always 0, which
> represents the failure of the function. The result of phm_find_boot_level()
> should be recorded and returned. Error handling is also needed for
> phm_find_boot_level() to
From: Christian König
[ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter
Signed-off-by: Christian Kön
From: Jay Cornwall
[ Upstream commit 3666ed821832f42baaf25f362680dda603cde732 ]
KIQ invalidate_tlbs request has been seen to marginally exceed the
configured 100 ms timeout on systems under load.
All other KIQ requests in the driver use a 10 second timeout. Use a
similar timeout implementation
From: Christian König
[ Upstream commit f5e7fabd1f5c65b2e077efcdb118cfa67eae7311 ]
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Acked-by: Simona Vetter
Signed-off-by: Christian Kön
Avoid the use of struct drm_gem_object.import_attach to get the
object's dma-buf or test for an imported buffer. The import_attach
field in struct drm_gem_object is an artifact of the import process,
but should not be used otherwise.
The helper drm_gem_is_imported() tests if a GEM object's buffer
From: Emily Deng
[ Upstream commit ba6d8f878d6180d4d0ed0574479fc1e232928184 ]
Disable per queue reset for sriov.
Signed-off-by: Emily Deng
Reviewed-by: Jonathan Kim
Signed-off-by: Alex Deucher
Signed-off-by: Sasha Levin
---
drivers/gpu/drm/amd/amdkfd/kfd_topology.c | 3 ++-
1 file changed,
2 reasons for this change:
1. When pdb0 is enabled, the gpu addr from amdgpu_bo_create_kernel() is based
on pdb0, which is not compatible with PSP and ucode; it needs to be updated
to use the original gpu address.
2. Since the original address will change after switching to a new GPU
index after hibernation, it needs to
Am 12.04.25 um 10:03 schrieb Arunpravin Paneer Selvam:
> Add queue id support to the user queue wait IOCTL
> drm_amdgpu_userq_wait structure.
>
> This is required to retrieve the wait user queue and maintain
> the fence driver references in it so that the user queue in
> the same context releases t
When switching to a new GPU index after hibernation and resume,
the VRAM offset of each VRAM BO changes, and the cached GPU
addresses need to be updated.
This enables pdb0 and switches to pdb0-based virtual GPU
addresses by default in amdgpu_bo_create_reserved(). Since the virtual
addre
IH does not work after switching to a new GPU index for the first time.
The IH handler function needs to be re-registered with the kernel after
switching to a new GPU index.
Signed-off-by: Samuel Zhang
Change-Id: Idece1c8fce24032fd08f5a8b6ac23793c51e56dd
---
drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c | 7 +
The GART TLB may be stale after switching to a new GPU index, which causes
the GPU to fetch wrong data from GTT memory. Flush the GART TLB at the end
of GMC resume to fix it.
Signed-off-by: Samuel Zhang
Change-Id: If2a3780319f5ecf3dcb0f1c07f85151ed65f522d
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +-
1 file ch
2 reasons for this change:
1. When pdb0 is enabled, the gpu addr from amdgpu_bo_create_kernel() is
based on pdb0, which is not compatible with SMU; it needs to be updated
to use the original gpu address.
2. Since the original gpu address will change after switching to a new GPU
index after hibernation, it needs to be up
For a virtual machine with vGPUs in SRIOV single device mode and XGMI
enabled, XGMI physical node ids may change when waking up from
hibernation with different vGPU devices. So update XGMI physical node
ids on resume.
Update GPU memory controller configuration on resume if XGMI physical
node ids a
On Fri, 11 Apr 2025, Melissa Wen wrote:
> +void dc_edid_copy_edid_to_sink(struct dc_sink *sink)
> +{
> + const struct edid *edid;
> + uint32_t edid_length;
> +
> + edid = drm_edid_raw(sink->drm_edid); // FIXME: Get rid of drm_edid_raw()
> + edid_length = EDID_LENGTH * (edid->extens
On Fri, 11 Apr 2025, Melissa Wen wrote:
> We don't need to parse dc_edid to get the display name since it's
> already set in drm_eld, which in turn had its values updated when updating
> the connector with the opaque drm_edid.
>
> Signed-off-by: Melissa Wen
> ---
> .../gpu/drm/amd/display/amdgpu_dm/am
Same explanation as patch 1 of the series here too. Do we want to depend
on the disable_kq flag solely to enable/disable sdma trap.
IIUC, we dont want to do it in case of kernel queues at all and only
needed when using userqueue and that is taken care by using the flag
disable_kq.
Regards
Suni
I feel existing implementation makes more sense. Using the disable_kq
flag to avoid get/put completely. Get/put for userqueues only here and
for kernel anyways we are handling in ring_init.
if (adev->userq_funcs[AMDGPU_HW_IP_GFX]), this if condition will always be
valid once we enable it based
This is how i see the future of this code and we can do based on it now
itself.
disable_kq = 0, Use kernel queues.
disable_kq = 1, Use User queues.
In case of kernel queues we should not be even calling
gfx_v11_0_set_userq_eop_interrupts at all. Instead its better if we add
a this check "if (a
Am 13.04.25 um 13:31 schrieb Fedor Pchelkin:
> On Thu, 10. Apr 11:07, Christian König wrote:
>> Am 09.04.25 um 19:27 schrieb Linus Torvalds:
>>> The VM layer allows larger allocations. But the "this is a simple
>>> allocation, choose kmalloc or vmalloc automatically based on size"
>>> helper says "
Coming back to the original patch.
Am 08.04.25 um 11:17 schrieb Denis Arefev:
> The user can set any value to the variable ‘bo_number’, via the ioctl
> command DRM_IOCTL_AMDGPU_BO_LIST. This will affect the arithmetic
> expression ‘in->bo_number * in->bo_info_size’, which is prone to
> overflow. A
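The overflow risk in an expression of the form `count * size` can be illustrated with a small standalone check. This is a minimal sketch, not the fix proposed in the patch: the function name is hypothetical, and treating both operands as 32-bit unsigned values is an assumption about the ioctl fields. Inside the kernel, the helpers in linux/overflow.h, such as check_mul_overflow(), serve the same purpose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Multiply two 32-bit unsigned values without wrapping.
 * Returns false and leaves *result untouched if a * b does not fit in 32 bits.
 * (Hypothetical helper; the operand widths are an assumption.)
 */
static bool mul_u32_checked(uint32_t a, uint32_t b, uint32_t *result)
{
	/* a * b overflows exactly when a > UINT32_MAX / b (for b != 0). */
	if (b != 0 && a > UINT32_MAX / b)
		return false;

	*result = a * b;
	return true;
}
```

A caller would reject the request when the function returns false instead of using a wrapped product as an allocation or copy size.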
On 4/13/2025 11:54 PM, Alex Deucher wrote:
Pull it out of the MES code and into the generic code.
It's not MES specific and needs to be applied to all user
queues regardless of the backend.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/amdgpu_userqueue.c | 13 +
driv
The function fill_stream_properties_from_drm_display_mode() calls
drm_hdmi_avi_infoframe_from_display_mode() and
drm_hdmi_vendor_infoframe_from_display_mode(), but does
not check their return values. Log error messages to prevent silent
failure if either function fails.
On Thu, 10. Apr 11:07, Christian König wrote:
> Am 09.04.25 um 19:27 schrieb Linus Torvalds:
> > The VM layer allows larger allocations. But the "this is a simple
> > allocation, choose kmalloc or vmalloc automatically based on size"
> > helper says "you are being simple, I'm going to check your ar
The return value of fiji_populate_smc_boot_level() is always 0, which
represents the failure of the function. The result of phm_find_boot_level()
should be recorded and returned. Error handling is also needed for
phm_find_boot_level() to reset the boot level when the function fails.
A proper implemen