Ping ...
On 2025-07-22 08:59, James Zhu wrote:
dst MIGRATE_PFN_VALID bit and src MIGRATE_PFN_MIGRATE bit
should always be set when migration succeeds. cpage includes
src MIGRATE_PFN_MIGRATE bit set and MIGRATE_PFN_VALID bit
unset pages for both ram and vram when memory is only allocated
without
migration pages directly from copy function
-v4 correct comments and copy function return mpage (suggested-by Felix)
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 72
1 file changed, 36 insertions(+), 36 deletions(-)
diff --git a/drivers/gpu/drm/amd
Hi Felix
Thanks!
Best Regards!
James Zhu
On 2025-07-18 18:09, Felix Kuehling wrote:
On 2025-07-14 08:46, James Zhu wrote:
dst MIGRATE_PFN_VALID bit and src MIGRATE_PFN_MIGRATE bit
should always be set when migration succeeds. cpage includes
src MIGRATE_PFN_MIGRATE bit set and
count as
migrate_unsuccessful_pages.
-v2 use dst to check MIGRATE_PFN_VALID bit(suggested-by philip)
-v3 add warning when vram pages are fewer than migration pages
return migration pages directly from copy function
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 44
ess will be stuck here.
Using down_write_trylock instead of mmap_write_lock avoids blocking the
second and following eviction work queue processes.
-v2: just return if the write lock cannot be taken; let the caller decide whether to retry.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_charde
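The try-lock-and-return pattern described in the commit message above can be illustrated in user space roughly as follows. This is only a sketch: the toy_lock type, the helper names, and the -EAGAIN return value are assumptions made for illustration, and a C11 atomic_flag stands in for the kernel's mmap lock.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a write lock; not the kernel API. */
struct toy_lock {
    atomic_flag held;
};

/* Returns true if the lock was free and is now held by the caller. */
static bool toy_write_trylock(struct toy_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

static void toy_write_unlock(struct toy_lock *l)
{
    atomic_flag_clear(&l->held);
}

/*
 * Hypothetical eviction step: instead of blocking on the lock, give up
 * immediately and let the caller decide whether to retry. The -EAGAIN
 * value is an assumption, not taken from the patch.
 */
static int evict_once(struct toy_lock *l, int *work_done)
{
    if (!toy_write_trylock(l))
        return -EAGAIN;

    (*work_done)++;          /* placeholder for the real eviction work */
    toy_write_unlock(l);
    return 0;
}
```

A second worker that finds the lock held gets an error back straight away and is never queued behind the first, which is the behaviour the commit message is after.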
to get migration pages. dst bit MIGRATE_PFN_VALID and src
bit MIGRATE_PFN_MIGRATE should always be set on success.
-v2 use dst to check MIGRATE_PFN_VALID bit(suggested-by philip)
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 22 ++
1 file changed
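The rule stated in the commit message above (a page counts as migrated only when its dst entry has MIGRATE_PFN_VALID set and its src entry has MIGRATE_PFN_MIGRATE set) can be sketched in user space as below. The helper name count_migrated_pages is hypothetical, and the two bit values are defined locally as assumptions; the kernel's own definitions live in include/linux/migrate.h.

```c
#include <stddef.h>

/*
 * Local stand-ins for illustration only. The kernel's definitions are in
 * include/linux/migrate.h; the bit positions used here are assumed.
 */
#define MIGRATE_PFN_VALID   (1UL << 0)
#define MIGRATE_PFN_MIGRATE (1UL << 1)

/*
 * Hypothetical helper: count the pages whose src entry carries
 * MIGRATE_PFN_MIGRATE and whose dst entry carries MIGRATE_PFN_VALID.
 */
static unsigned long count_migrated_pages(const unsigned long *src,
                                          const unsigned long *dst,
                                          size_t npages)
{
    unsigned long mpages = 0;

    for (size_t i = 0; i < npages; i++) {
        if ((src[i] & MIGRATE_PFN_MIGRATE) && (dst[i] & MIGRATE_PFN_VALID))
            mpages++;
    }
    return mpages;
}
```

Pages that were collected (src bit set) but have no valid dst entry are left out of the count, so they would fall into the unsuccessful bucket.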
upages is assigned under cpages = 0, so it isn't really used in this function.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c
to get migration pages. When migrating pages from system to vram,
needn't check bit MIGRATE_PFN_VALID, since the system page could
be allocated, but not be accessed.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 25
1 file change
On 2025-05-09 02:00, Christian König wrote:
On 5/8/25 19:25, James Zhu wrote:
On 2025-05-08 11:20, James Zhu wrote:
On 2025-05-08 10:50, Christian König wrote:
On 5/8/25 16:46, James Zhu wrote:
When XNACK is on, hangs or low performance are observed with some test cases.
The restoring page
On 2025-05-08 17:54, Felix Kuehling wrote:
On 2025-05-08 10:50, Christian König wrote:
On 5/8/25 16:46, James Zhu wrote:
When XNACK is on, hangs or low performance are observed with some test cases.
The page-restoring process gets stuck unexpectedly during evicting/restoring
if some bo's fla
On 2025-05-08 11:20, James Zhu wrote:
On 2025-05-08 10:50, Christian König wrote:
On 5/8/25 16:46, James Zhu wrote:
When XNACK is on, hangs or low performance are observed with some test cases.
The page-restoring process gets stuck unexpectedly during evicting/restoring
if some bo's fla
On 2025-05-08 10:50, Christian König wrote:
On 5/8/25 16:46, James Zhu wrote:
When XNACK is on, hangs or low performance are observed with some test cases.
The page-restoring process gets stuck unexpectedly during evicting/restoring
if some bo's flag has KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED se
ess will be stuck here.
Using down_write_trylock instead of mmap_write_lock avoids blocking the
second and following eviction work queue processes.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/driver
before move to GTT domain.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 6 --
1 file changed, 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 62ca12e94581..2ac6d4fa0601
On 2025-01-09 11:57, Felix Kuehling wrote:
From: Christian König
Try pinning into VRAM to allow P2P with RDMA NICs without ODP
support if all attachments can do P2P. If any attachment can't do
P2P just pin into GTT instead.
Signed-off-by: Christian König
Signed-off-by: Felix Kuehling
Revie
Reviewed-and-Tested-by: James Zhu for the series
On 2025-01-29 10:28, Christian König wrote:
Test the fences in the private dma_resv object instead of the pointer to
a potentially shared dma_resv object.
This only matters for imported BOs with an SG table since those don't
get their dma
My Q or A is inline.
Thanks!
James Zhu
On 2025-01-08 04:18, Christian König wrote:
Am 07.01.25 um 21:01 schrieb James Zhu:
this original test condition is unclear.
No that is completely unnecessary.
The point is that with fence->ops->signaled provided the fence should
make progres
this original test condition is unclear.
Signed-off-by: James Zhu
---
drivers/gpu/drm/ttm/ttm_bo.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 48c5365efca1..d40f07802c4f 100644
--- a/drivers/gpu/drm/ttm
Reviewed-by: James Zhu for the series.
On 2024-09-13 04:32, Feifei Xu wrote:
Expose the interface for kfd to config sq perfmon.
Signed-off-by: Feifei Xu
Suggested-by: Hawking Zhang
Reviewed-by: Lijo Lazar
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 15 +++
drivers/gpu/drm/amd/amd
error code if migration failed.
4. Report dropped event count if fifo is full.
v3:
Simplify event drop count handling (James Zhu)
Philip Yang (4):
drm/amdkfd: Document and define SVM events message macro
drm/amdkfd: Output migrate end event if migrate failed
drm/amdkfd: Increase SMI event
On 2024-07-30 16:15, Philip Yang wrote:
Add new SMI event to report the dropped event count when the event kfifo
is full.
When the kfifo has space for two events, generate a dropped event record
to report how many events were dropped, together with the next event to
add to kfifo.
After readin
On 2024-07-30 16:15, Philip Yang wrote:
SMI event fifo size 1KB was enough to report GPU vm fault or reset
[JZ] There is a typo here. It should be NOT enough.
event, increase it to 8KB to store about 100 migrate events, less chance
to drop the migrate events if lots of migration happened in t
On 2024-07-30 16:15, Philip Yang wrote:
Document how to use SMI system management interface to enable and
receive SVM events. Document SVM event triggers.
Define SVM events message string format macro that could be used by user
mode for sscanf to parse the event. Add it to uAPI header file to ma
On 2024-05-01 18:56, Philip Yang wrote:
On systems with khugepaged enabled and use cases with THP buffers,
hmm_range_fault may take > 15 seconds to return -EBUSY; the arbitrary
timeout value is not accurate and causes memory allocation failure.
Remove the arbitrary timeout value, return EAGAIN
On 2024-04-30 07:36, Lijo Lazar wrote:
amdxcp is a platform driver for creating partition devices. libdrm
library identifies a platform device based on 'OF_FULLNAME' or
'MODALIAS'. If two or more devices have the same platform name, drm
library only picks the first device. Platform driver core us
Ping ...
Best Regards!
James Zhu
On 2024-02-06 10:58, James Zhu wrote:
PC sampling is a form of software profiling, where the threads of an application
are periodically interrupted and the program counter that the threads are
currently attempting to execute is saved out for profiling
Since TRAPSTS.HOST_TRAP won't work pre-gfx943, use
TTMP1 (bit 24: HT) and (bit 16-23: trapID) to identify
the host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |2 +
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2117 +
.../dr
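The TTMP1 layout named in the commit message above (bit 24: HT, bits 16-23: trapID) can be decoded as in the sketch below. The real trap handler is shader assembly; this C version only illustrates the bit layout, and all macro and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as stated in the commit message; names are hypothetical. */
#define TTMP1_HT_SHIFT       24u
#define TTMP1_HT_MASK        (1u << TTMP1_HT_SHIFT)
#define TTMP1_TRAP_ID_SHIFT  16u
#define TTMP1_TRAP_ID_MASK   (0xffu << TTMP1_TRAP_ID_SHIFT)

/* True when the HT (host trap) bit, bit 24, is set. */
static inline bool ttmp1_is_host_trap(uint32_t ttmp1)
{
    return (ttmp1 & TTMP1_HT_MASK) != 0;
}

/* Extracts the 8-bit trap ID held in bits 16-23. */
static inline uint32_t ttmp1_trap_id(uint32_t ttmp1)
{
    return (ttmp1 & TTMP1_TRAP_ID_MASK) >> TTMP1_TRAP_ID_SHIFT;
}
```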
Add a kthread to trigger pc sampling trap.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 91 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 +
2 files changed, 89 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
Add pc sampling release on process release; it forces all active
sessions of this process to stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 25
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h | 1 +
drivers/gpu/drm/amd/amdkfd
KFD_RUNTIME_ENABLE_MODE_ENABLE_MASK flag on exit.
It is also not valid to have the debugger attached to a process while PC
sampling is enabled so adding some checks to prevent this.
Signed-off-by: David Yat Sin
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30
Bump the minor version to declare pc sampling feature is now
available.
Signed-off-by: James Zhu
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index ec1b6404b185
Implement trigger pc sampling trap for aldebaran.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
b/drivers/gpu/drm/amd/amdgpu
Enable pc sampling stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 29 ++--
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 4 +++
2 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b
the queues either waits for the waves to drain, or preempts
them with CWSR, which itself executes a trap and waits for previous traps
to finish.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +++
drivers/gpu/drm/amd/amdkfd
Add setting trap pc sampling flag.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 13 +
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd
Add interface to trigger pc sampling trap.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 7 +++
1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index
From: David Yat Sin
Enable pc sampling create.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 59 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10
2 files changed, 68 insertions
Enable host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 63 +++
.../drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 24 ---
2 files changed, 52 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
b
Add trace_id return for new pc sampling creation per device.
Use IDR to quickly locate pc_sampling_entry for reference.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +++-
drivers/gpu/drm/amd
1st level TMA's 2nd byte, which is used for trap type setting,
to use bit operations to change the selected bit only.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
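Changing only the selected bit of a flag byte, as the commit message above describes, is an ordinary read-modify-write. A minimal sketch follows; the function name and the mask values used with it are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: set or clear only the bits in 'mask' within a
 * flag byte, leaving every other bit of the byte unchanged.
 */
static inline void flag_byte_update(uint8_t *byte, uint8_t mask, bool enable)
{
    if (enable)
        *byte = (uint8_t)(*byte | mask);
    else
        *byte = (uint8_t)(*byte & (uint8_t)~mask);
}
```

Writing the whole byte would instead clobber any other trap-type bits that another feature had already set.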
Enable pc sampling start.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 27 +---
1 file changed, 24 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index
Implement trigger pc sampling trap for arcturus.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c| 14 +-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
b/drivers/gpu/drm/amd/amdgpu
From: David Yat Sin
Add pc sampling support in kfd_ioctl.
The user mode code which uses this new kfd_ioctl is linked to
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
with master branch.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
include
Enable pc sampling destroy.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index
Implement trigger pc sampling trap for gfx v9.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 36 +++
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 7
2 files changed, 43 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Before firing a new host trap, check the host trap status.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 35 +++
.../amd/include/asic_reg/gc/gc_9_0_offset.h | 2 ++
.../amd/include/asic_reg/gc/gc_9_0_sh_mask.h | 5 +++
3 files changed, 42
Check pcs_entry valid for pc sampling ioctl.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 33 ++--
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd
Add pc sampling mutex per node, and do init/destroy in node init.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 12
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 +++
2 files changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
From: David Yat Sin
Enable pc sampling to query system capability.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 65 +++-
1 file changed, 64 insertions(+), 1 deletion(-)
diff --git a
From: David Yat Sin
Add pc sampling functions in amdkfd.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 3 +-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 45 +++
drivers/gpu/drm/amd/amdkfd
: add pc sampling support
drm/amdkfd: enable pc sampling query
drm/amdkfd: enable pc sampling create
drm/amdkfd: Set debug trap bit when enabling PC Sampling
James Zhu (19):
drm/amdkfd: add pc sampling mutex
drm/amdkfd: add trace_id return
drm/amdkfd: check pcs_entry valid
drm/amdkfd
On 2024-01-08 03:12, Christian König wrote:
Am 02.01.24 um 21:56 schrieb James Zhu:
Current AMDGPU_VM_RESERVED_VRAM is updated to 8M.
Signed-off-by: James Zhu
Maybe remove the value completely from the comment, just something
like "How much memory be reserved for page tables".
Current AMDGPU_VM_RESERVED_VRAM is updated to 8M.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index b6cd565562ad
On 2023-12-15 10:59, James Zhu wrote:
From: David Yat Sin
We need the SPI_GDBG_PER_VMID_CNTL.TRAP_EN bit to be set during PC
Sampling so that the TTMP registers are valid inside the sampling data.
runtime_info.ttmp_setup will be cleared when the user application
does the
From: David Yat Sin
We need the SPI_GDBG_PER_VMID_CNTL.TRAP_EN bit to be set during PC
Sampling so that the TTMP registers are valid inside the sampling data.
runtime_info.ttmp_setup will be cleared when the user application
does the AMDKFD_IOC_RUNTIME_ENABLE ioctl without
KFD_RUNTIME_ENABLE_MODE
Add setting trap pc sampling flag.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 13 +
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd
Enable pc sampling stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 4 +++
2 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b
Add pc sampling release on process release; it forces all active
sessions of this process to stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 21
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h | 1 +
drivers/gpu/drm/amd/amdkfd
Enable host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 63 +++
.../drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 24 ---
2 files changed, 52 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
b
Before firing a new host trap, check the host trap status.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 35 +++
.../amd/include/asic_reg/gc/gc_9_0_offset.h | 2 ++
.../amd/include/asic_reg/gc/gc_9_0_sh_mask.h | 5 +++
3 files changed, 42
Bump the minor version to declare pc sampling feature is now
available.
Signed-off-by: James Zhu
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 1bd1347effea
Since TRAPSTS.HOST_TRAP won't work pre-gfx943, use
TTMP1 (bit 24: HT) and (bit 16-23: trapID) to identify
the host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |2 +
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2117 +
.../dr
Add pc sampling mutex per node, and do init/destroy in node init.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 12
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 7 +++
2 files changed, 19 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
Enable pc sampling start.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 26 +---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index
Implement trigger pc sampling trap for aldebaran.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
b/drivers/gpu/drm/amd/amdgpu
the queues either waits for the waves to drain, or preempts
them with CWSR, which itself executes a trap and waits for previous traps
to finish.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +++
drivers/gpu/drm/amd/amdkfd
Add a kthread to trigger pc sampling trap.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 68 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 +
2 files changed, 68 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
1st level TMA's 2nd byte, which is used for trap type setting,
to use bit operations to change the selected bit only.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
Add interface to trigger pc sampling trap.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/include/kgd_kfd_interface.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 6d094cf3587d
Implement trigger pc sampling trap for gfx v9.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c | 36 +++
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.h | 7
2 files changed, 43 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu
Check pcs_entry valid for pc sampling ioctl.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 33 ++--
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd
Implement trigger pc sampling trap for arcturus.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c| 14 +-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_arcturus.c
b/drivers/gpu/drm/amd/amdgpu
Enable pc sampling destroy.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index
: add pc sampling support
drm/amdkfd: enable pc sampling query
drm/amdkfd: enable pc sampling create
drm/amdkfd: set debug trap bit when enabling PC Sampling
James Zhu (19):
drm/amdkfd: add pc sampling mutex
drm/amdkfd: add trace_id return
drm/amdkfd: check pcs_entry valid
drm/amdkfd
Add trace_id return for new pc sampling creation per device.
Use IDR to quickly locate pc_sampling_entry for reference.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device.c | 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 20 +++-
drivers/gpu/drm/amd
From: David Yat Sin
Enable pc sampling to query system capability.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 54 +++-
1 file changed, 53 insertions(+), 1 deletion(-)
diff --git a
From: David Yat Sin
Add pc sampling functions in amdkfd.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/Makefile | 3 +-
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +++
drivers/gpu/drm/amd/amdkfd
From: David Yat Sin
Enable pc sampling create.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 53 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10
2 files changed, 62 insertions
From: David Yat Sin
Add pc sampling support in kfd_ioctl.
The user mode code which uses this new kfd_ioctl is linked to
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface
with master branch.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
include
On 2023-12-13 11:23, Felix Kuehling wrote:
On 2023-12-13 10:24, James Zhu wrote:
Ping ...
On 2023-12-08 18:01, James Zhu wrote:
When an application tries to allocate all system memory and causes memory
to swap out, hmm_range_fault needs more time to validate the
remaining page for
/amdkfd: enable pc sampling query
From: David Yat Sin
Enable pc sampling to query system capability.
Co-developed-by: James Zhu
Signed-off-by: James Zhu
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 54 +++-
1 file changed, 53 insertions(+), 1
Ping ...
On 2023-12-08 18:01, James Zhu wrote:
When an application tries to allocate all system memory and causes memory
to swap out, hmm_range_fault needs more time to validate the
remaining page for allocation. To be safe, increase timeout value to
1 second for 64MB range.
Signed-off-by
Only schedule when hmm_range_fault returns error.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
b/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
index b24eb5821fd1
/zhums/ROCT-Thunk-Interface/tree/zhums/ROCT-Thunk.
David Yat Sin (4):
drm/amdkfd/kfd_ioctl: add pc sampling support
drm/amdkfd: add pc sampling support
drm/amdkfd: enable pc sampling query
drm/amdkfd: enable pc sampling create
James Zhu (19):
drm/amdkfd: add pc sampling mutex
drm/amdkfd
Check pcs_entry valid for pc sampling ioctl.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 33 ++--
1 file changed, 30 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd
Ping ...
On 2023-12-07 17:53, James Zhu wrote:
PC sampling is a form of software profiling, where the threads of an application
are periodically interrupted and the program counter that the threads are
currently attempting to execute is saved out for profiling.
David Yat Sin (4):
drm
On 2023-12-11 05:38, Christian König wrote:
Am 09.12.23 um 00:01 schrieb James Zhu:
There is no need to schedule after each hmm_range_fault; use cond_resched
instead of schedule.
cond_resched() is usually NAKed upstream since it is a NO-OP in most
situations.
[JZ] then let me change ba
When an application tries to allocate all system memory and causes memory
to swap out, hmm_range_fault needs more time to validate the
remaining page for allocation. To be safe, increase timeout value to
1 second for 64MB range.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu
There is no need to schedule after each hmm_range_fault; use cond_resched
instead of schedule.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_hmm.c
b/drivers/gpu/drm/amd/a
Add pc sampling release on process release; it forces all active
sessions of this process to stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 21
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.h | 1 +
drivers/gpu/drm/amd/amdkfd
Bump the minor version to declare pc sampling feature is now
available.
Signed-off-by: James Zhu
---
include/uapi/linux/kfd_ioctl.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index 1bd1347effea
Add a kthread to trigger pc sampling trap.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 68 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 1 +
2 files changed, 68 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
Since TRAPSTS.HOST_TRAP won't work pre-gfx943, use
TTMP1 (bit 24: HT) and (bit 16-23: trapID) to identify
the host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v9.c |2 +
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 2117 +
.../dr
the queues either waits for the waves to drain, or preempts
them with CWSR, which itself executes a trap and waits for previous traps
to finish.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c | 11 +++
drivers/gpu/drm/amd/amdkfd
1st level TMA's 2nd byte, which is used for trap type setting,
to use bit operations to change the selected bit only.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 16 +---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/a
Enable pc sampling stop.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 28 +---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 4 +++
2 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b
Implement trigger pc sampling trap for aldebaran.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c | 11 +++
1 file changed, 11 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_aldebaran.c
b/drivers/gpu/drm/amd/amdgpu
Add setting trap pc sampling flag.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 13 +
2 files changed, 15 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
b/drivers/gpu/drm/amd/amdkfd
Enable pc sampling start.
Signed-off-by: James Zhu
---
drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c | 26 +---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
b/drivers/gpu/drm/amd/amdkfd/kfd_pc_sampling.c
index
Enable host trap.
Signed-off-by: James Zhu
---
.../gpu/drm/amd/amdkfd/cwsr_trap_handler.h| 63 +++
.../drm/amd/amdkfd/cwsr_trap_handler_gfx9.asm | 24 ---
2 files changed, 52 insertions(+), 35 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/cwsr_trap_handler.h
b