i but
kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line
debugfs_remove_recursive(entry->proc_dentry);
tries to remove /sys/kernel/debug/kfd/proc/ while
/sys/kernel/debug/kfd is already gone. It hangs the kernel with a kernel
NULL pointer dereference.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/am
Starting from MEC v97, GC 9.4.2 supports chain runlists of XNACK+/XNACK-
processes.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 3 +++
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.c | 12
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_2.h | 1 +
3 files changed, 16
If the MEC firmware supports chaining runlists of XNACK+/XNACK-
processes, set SQ_CONFIG1 chicken bit and SET_RESOURCES bit 28.
When the MEC/HWS supports it, KFD checks whether a mix of XNACK+ and
XNACK- processes occurs. If it does, enter over-subscription.
Signed-off-by: Amber Lin
Reviewed-by
When submitting MQD to CP, set SDMA_RLCx_IB_CNTL/SWITCH_INSIDE_IB bit so
it'll allow SDMA preemption if there is a massive command buffer of
long-running SDMA commands.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 4
1 file changed, 4 insertions(+)
A nitpick below. With that addressed,
Reviewed-by: Amber Lin
Regards,
Amber
On 3/27/25 13:47, Apurv Mishra wrote:
remove workaround code for the early engineering samples
GC v9.4.3 SOCs with revID 0 - GFX 940 & 941 - from driver
Remove "- GFX 940 & 941 - from driver"
Correct F8_MODE setting for gfx950 that was removed
Fixes: 1a9dbc31d234
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c
b
Reviewed-by: Amber Lin
Regards,
Amber
On 3/6/25 14:52, Harish Kasiviswanathan wrote:
0x9
From: Alex Sierra
Default F8_MODE should be OCP format on gfx950.
Signed-off-by: Alex Sierra
Reviewed-by: Harish Kasiviswanathan
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_v9.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a
Reviewed-by: Amber Lin
Regards,
Amber
On 3/6/25 14:52, Harish Kasiviswanathan wrote:
Define set_cache_memory_policy() for these asics and move all static
changes from update_qpd() which is called each time a queue is created
to set_cache_memory_policy() which is called once during process
Reviewed-by: Amber Lin
Regards,
Amber
On 3/6/25 14:52, Harish Kasiviswanathan wrote:
Set per-process static sh_mem config only once during process
initialization. Move all static changes from update_qpd() which is
called each time a queue is created to set_cache_memory_policy() which
is
From: Harish Kasiviswanathan
Define set_cache_memory_policy() for these asics and move all static
changes from update_qpd() which is called each time a queue is created
to set_cache_memory_policy() which is called once during process
initialization
Signed-off-by: Harish Kasiviswanathan
---
...
From: Harish Kasiviswanathan
Set per-process static sh_mem config only once during process
initialization. Move all static changes from update_qpd() which is
called each time a queue is created to set_cache_memory_policy() which
is called once during process initialization.
set_cache_memory_poli
From: Harish Kasiviswanathan
Add support for more per-process flags, starting with an option to configure
MFMA precision for gfx 9.5
v2: Change flag name to KFD_PROC_FLAG_MFMA_HIGH_PRECISION
Remove unused else condition
Signed-off-by: Harish Kasiviswanathan
Reviewed-by: Felix Kuehling
---
dr
Reviewed-by: Amber Lin
Regards,
Amber
On 2025-02-27 12:31, Jonathan Kim wrote:
Remove unused declaration of gws_debug_workaround.
Signed-off-by: Jonathan Kim
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
As long as the number of XCCs, the number of compute partitions, and the
number of memory partitions qualify, CPX is valid.
Change-Id: I65696f25e2afd75f2f4a177dabc0991b15293d9a
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 5 -
1 file changed, 4 insertions(+), 1
Reviewed-by: Amber Lin
Regards,
Amber
On 2024-11-06 11:08, Amber Lin wrote:
From: Max Erenberg
These options are necessary to use virtio devices with QEMU.
Signed-off-by: Max Erenberg
---
arch/x86/configs/rock-dbg_defconfig | 14 ++
1 file changed, 14 insertions(+)
diff
From: Max Erenberg
These options are necessary to use virtio devices with QEMU.
Signed-off-by: Max Erenberg
---
arch/x86/configs/rock-dbg_defconfig | 14 ++
1 file changed, 14 insertions(+)
diff --git a/arch/x86/configs/rock-dbg_defconfig
b/arch/x86/configs/rock-dbg_defconfig
ind
(num_xcc % adev->gmc.num_mem_partitions) == 0 is not a requirement for
CPX. It breaks NPS4/CPX support on APU.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/aqua_vanjara
Tested-by: Amber Lin
Acked-by: Amber Lin
Regards,
Amber
On 9/12/24 19:29, Alex Deucher wrote:
Need to make sure it's halted as we don't know what state
the GPU may have been left in previously.
Signed-off-by: Alex Deucher
---
drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +
Tested-by: Amber Lin
Reviewed-by: Amber Lin
Regards,
Amber
On 9/12/24 15:24, Alex Deucher wrote:
Need to set the pipe reset and cache invalidation bits
on halt otherwise we can get stale state if the CP firmware
changes (e.g., on module unload and reload).
Signed-off-by: Alex Deucher
On GC_9.4.3, if atombios reports a TMR size of less than 280MB, the firmware
area will be overwritten by driver or user application use. Remove the
!adev->bios condition: since reserve_size is initialized to 0, it'll fall into
the else if (!reserve_size) condition.
Signed-off-by: Amber Lin
---
drivers
Remove unused entries in kfd_device_info table: num_xgmi_sdma_engines
and num_sdma_queues_per_engine. They are calculated in
kfd_get_num_sdma_engines and kfd_get_num_xgmi_sdma_engines instead.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 58
ix to non static function names
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 20
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 32 +++
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 3 ++
drivers/gpu/drm/amd/amdkfd/kfd_topol
_num_*_sdma_engines to global and shared by queues manager
and topology.
v3: Use gmc.xgmi.supported to justify the SDMA PCIe/XGMI assignment
Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 20
.../drm/amd/a
Reviewed-by: Amber Lin
On 8/20/21 3:11 PM, Mukul Joshi wrote:
Program trap handler settings to enable CWSR with software scheduler
on Aldebaran and Arcturus.
Signed-off-by: Mukul Joshi
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 1 +
drivers/gpu/drm/amd/amdgpu
reclaim inside the DQM lock creates a problematic circular
lock dependency. Therefore move free_mqd out of
destroy_queue_nocpsch_locked and call it after unlocking DQM.
Signed-off-by: Amber Lin
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 18 +-
1
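The locking change described above (do the free only after the DQM lock is dropped) can be illustrated with a minimal userspace sketch. This is not the driver code: `struct queue_manager`, `unlink_queue_locked()` and the other names are hypothetical stand-ins, a pthread mutex plays the role of the DQM lock, and `free()` plays the role of free_mqd.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names come from the driver. */
struct queue {
    void *mqd;               /* stand-in for the memory queue descriptor */
    struct queue *next;
};

struct queue_manager {
    pthread_mutex_t lock;    /* stand-in for the DQM lock */
    struct queue *queues;
    int queue_count;
};

static void queue_manager_init(struct queue_manager *mgr)
{
    pthread_mutex_init(&mgr->lock, NULL);
    mgr->queues = NULL;
    mgr->queue_count = 0;
}

/* Allocate a queue and push it onto the head of the list. */
static struct queue *create_queue(struct queue_manager *mgr)
{
    struct queue *q = calloc(1, sizeof(*q));

    if (!q)
        return NULL;
    q->mqd = malloc(64);
    if (!q->mqd) {
        free(q);
        return NULL;
    }
    pthread_mutex_lock(&mgr->lock);
    q->next = mgr->queues;
    mgr->queues = q;
    mgr->queue_count++;
    pthread_mutex_unlock(&mgr->lock);
    return q;
}

/* Unlink q from the list. Caller must hold mgr->lock. Frees nothing. */
static int unlink_queue_locked(struct queue_manager *mgr, struct queue *q)
{
    struct queue **pp;

    for (pp = &mgr->queues; *pp; pp = &(*pp)->next) {
        if (*pp == q) {
            *pp = q->next;
            q->next = NULL;
            mgr->queue_count--;
            return 0;
        }
    }
    return -1;
}

/* Only the bookkeeping runs under the lock; the free runs after unlock. */
static int destroy_queue(struct queue_manager *mgr, struct queue *q)
{
    int ret;

    pthread_mutex_lock(&mgr->lock);
    ret = unlink_queue_locked(mgr, q);
    pthread_mutex_unlock(&mgr->lock);

    if (ret == 0) {
        free(q->mqd);
        free(q);
    }
    return ret;
}
```

The design point is that the locked helper touches only the manager's own data, so whatever the free path may trigger never runs while the lock is held.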
-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 72c893fff61a..3cd46d7190b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
On 2020-07-23 5:41 p.m., Joshi, Mukul wrote:
[AMD Official Use Only - Internal Distribution Only]
-----Original Message-----
From: Lin, Amber
Sent: Thursday, July 23, 2020 5:27 PM
To: Joshi, Mukul ; amd-gfx@lists.freedesktop.org
Cc: Kuehling, Felix
Subject: Re: [PATCH v2] drm/amdkfd: Add
On 2020-07-22 12:08 p.m., Mukul Joshi wrote:
Add support for reporting thermal throttling events through SMI.
Also, add a counter to count the number of throttling interrupts
observed and report the count in the SMI event message.
Signed-off-by: Mukul Joshi
---
drivers/gpu/drm/amd/amdgpu/a
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
dr
Bump KFD ioctl after adding SMI events support
Signed-off-by: Amber Lin
---
include/uapi/linux/kfd_ioctl.h | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index ad33c18..46adbcc 100644
--- a/include/uapi
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
dr
On 2020-04-17 6:31 p.m., Felix Kuehling wrote:
Am 2020-04-17 um 4:07 p.m. schrieb Amber Lin:
When the compute is malfunctioning or performance drops, the system admin
will use the SMI (System Management Interface) tool to monitor/diagnose what
went wrong. This patch provides an event watch
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
dr
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
v5: move events enablement from ioctl to fd write
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
dr
write file-operation to
write an event mask (of arbitrary length if you want to enable growth
in the future). That way everything would be neatly encapsulated in
the event FD private data.
Two more comments inline ...
Am 2020-04-14 um 5:30 p.m. schrieb Amber Lin:
When the compute is mal
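The review suggestion above (enable events by writing a mask to the event file descriptor, with the mask kept in the descriptor's private data) can be sketched in userspace as follows. This is only an illustration under stated assumptions: `struct event_client`, `event_client_write()`, `event_client_enabled()` and the two-word mask size are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EVENT_MASK_WORDS 2   /* hypothetical: room for 128 event types */

/* Hypothetical per-descriptor private data holding the enabled events. */
struct event_client {
    uint64_t mask[EVENT_MASK_WORDS];
};

/*
 * Mimics a write handler: the written buffer replaces the event mask.
 * Only whole 64-bit words are accepted. A write shorter than the full
 * mask leaves the remaining words cleared; a longer one is rejected.
 * Returns the number of bytes consumed, or -1 on a malformed write.
 */
static long event_client_write(struct event_client *c, const void *buf,
                               size_t len)
{
    assert(c != NULL && buf != NULL);

    if (len == 0 || len % sizeof(uint64_t) != 0 || len > sizeof(c->mask))
        return -1;

    memset(c->mask, 0, sizeof(c->mask));
    memcpy(c->mask, buf, len);
    return (long)len;
}

/* Returns 1 if the given event number is enabled for this client. */
static int event_client_enabled(const struct event_client *c,
                                unsigned int event)
{
    if (event >= 64 * EVENT_MASK_WORDS)
        return 0;
    return (int)((c->mask[event / 64] >> (event % 64)) & 1);
}
```

Keeping the mask inside the per-descriptor structure means each client carries its own event selection, which is what makes it possible to add event types later by growing the mask.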
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
v4: support multiple clients
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
v3: send the event msg in text rather than in binary
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 1 +
drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 2 +
drivers/gpu/drm/amd/a
Further thinking about it, I'll use struct kfd_smi_msg_header. Instead
of using struct kfd_smi_msg_vmfault, it's a description about the event.
This way we make it generic to all events.
On 2020-04-03 9:38 a.m., Amber Lin wrote:
Thanks Felix. I'll make changes accordingly
Thanks Felix. I'll make changes accordingly but please pay attention to
my last reply inline.
On 2020-04-02 7:51 p.m., Felix Kuehling wrote:
On 2020-04-02 4:46 p.m., Amber Lin wrote:
When the compute is malfunctioning or performance drops, the system
admin
will use SMI (System Manag
NABLE/DISABLE
- correct kfifo usage
- move event message API to kfd_ioctl.h
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 3 +-
drivers/gpu/drm/amd/amdkfd/cik_event_interrupt.c | 2 +
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30
drivers/gp
multiple file descriptors referring to it. For the event
interface I think we can enforce only a single file descriptor per
device. If there is already one, your register call can fail. See more
comments inline.
On 2020-03-17 13:57, Amber Lin wrote:
When the compute is malfunctioning or
Sorry for the messed-up link. This is the link (rocm-smi-lib) which
makes use of the interface
https://github.com/RadeonOpenCompute/rocm_smi_lib
On 2020-03-23 2:19 p.m., Amber Lin wrote:
Somehow my reply didn't seem to reach the mailing list...
Hi Alex,
https://
think differently. Thanks.
Thanks.
Amber
On 2020-03-17 3:03 p.m., Alex Deucher wrote:
On Tue, Mar 17, 2020 at 1:57 PM Amber Lin wrote:
When the compute is malfunctioning or performance drops, the system admin
will use the SMI (System Management Interface) tool to monitor/diagnose what
went wrong.
, the user can use the anonymous file descriptor's poll function
with wait-time specified to wait for the event to happen. Once the event
happens, the user can use read() to retrieve information related to the
event.
VM fault event is done in this patch.
Signed-off-by: Amber Lin
---
drivers/gpu/dr
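The wait-then-read pattern described above (poll the descriptor with a timeout, then read() the event once it is ready) looks roughly like the sketch below. It is a userspace illustration only: a pipe stands in for the anonymous event descriptor, and `wait_and_read_event()`, `demo_event_roundtrip()` and the "vmfault" payload are made up for the example, not taken from the patch.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for data on fd, then read one message into buf.
 * Returns the number of bytes read, 0 on timeout, or -1 on error.
 * fd can be any pollable descriptor.
 */
static ssize_t wait_and_read_event(int fd, int timeout_ms, char *buf,
                                   size_t len)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ready;

    assert(buf != NULL && len > 0);

    ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;              /* poll() itself failed */
    if (ready == 0)
        return 0;               /* timed out, no event arrived */
    if (!(pfd.revents & POLLIN))
        return -1;              /* error or hang-up without data */

    return read(fd, buf, len);
}

/* Hypothetical demo: the write end of a pipe plays the event producer. */
static ssize_t demo_event_roundtrip(char *buf, size_t len)
{
    const char msg[] = "vmfault\n";   /* made-up text payload */
    int fds[2];
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], msg, strlen(msg)) != (ssize_t)strlen(msg)) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = wait_and_read_event(fds[0], 1000, buf, len);
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

A caller would typically loop on `wait_and_read_event()` and treat a return value of 0 as "nothing happened within the wait time".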
created. They are removed when the queue is
destroyed.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 90 ++
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +
3 files changed
created. They are removed when the queue is
destroyed.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 9 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 96 ++
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +
3 files changed
created. They are removed when the queue is
destroyed.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 9 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 99 ++
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +
3 files changed
MD is not
enabled.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 20 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 28 -
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 28 +
3 files change
MD is not
enabled.
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 20 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 28 -
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 28 +
3 files change
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.
v2: add kernel-doc comments
Change-Id: I2de8d6c96bb49554c028bbc84bdb194f974c9278
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 87
Since KFD is only supported by a single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.
v2: also remove kfd from drm Kconfig
Change-Id: I21c996ba29d393c1bf8064bdb2f5d89541159649
Signed-off-by: Amber Lin
Since KFD is only supported by a single GPU driver, it makes sense to merge
amdgpu and amdkfd into one module. This patch is the initial step: merge
Kconfig and Makefile.
Change-Id: I21c996ba29d393c1bf8064bdb2f5d89541159649
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/Kconfig | 1
After amdkfd is merged to amdgpu, CONFIG_HSA_AMD_MODULE no longer exists.
Change-Id: I42096cdf887e0d776075f3dd3e8d3f153aff4e85
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 26 +++---
1 file changed, 3 insertions(+), 23 deletions(-)
diff --git a
After merging KFD into amdgpu, move module parameters defined in KFD to
amdgpu_drv.c, where other module parameters are declared.
Change-Id: I2de8d6c96bb49554c028bbc84bdb194f974c9278
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 41
from gmc to amdgpu_mman structure, and the
implementation in amdgpu_ttm_* functions
Change-Id: I56574bd544dae273da50e8b5dd6894cd5d9454bd
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 17 +
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 1 +
2 files changed
: I56574bd544dae273da50e8b5dd6894cd5d9454bd
Signed-off-by: Amber Lin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.h| 1 +
drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 7 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c| 5 +
3 files changed, 13 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
58 matches