The NV10 mask is being used for gfx12; fix it.
Put back DCC flag and default mtype to MTYPE_NC.
Fixes: b8c76c59987a ("drm/amdgpu: rework how PTE flags are generated")
Suggested-by: Felix Kuehling
Co-authored-by: Harish Kasiviswanathan
Signed-off-by: David Yat Sin
Signed-off-by: Harish Kasiv
Add support for checkpoint/restore for SDMA queues of type
KFD_QUEUE_TYPE_SDMA_BY_ENG_ID.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 9 +
2 files changed, 10 insertions(+)
diff
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and
restores all the MQDs within the partition.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
.../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 61 ---
.../amd/amdkfd
GPUs with multi-xcc have multiple MQDs per queue. This patch saves and
restores all the MQDs within the partition.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 2 +-
.../gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 61 ---
.../amd/amdkfd
Add support for checkpoint/restore for SDMA queues of type
KFD_QUEUE_TYPE_SDMA_BY_ENG_ID.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 +
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 8
2 files changed, 9 insertions(+)
diff
Set memory mtype to UC host memory when the ext-coherent
flag is set and memory is registered as an SVM allocation.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
b
If the queue size is less than the minimum, clamp it to the minimum to
prevent underflow when writing the queue MQD.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 4
include/uapi/linux/kfd_ioctl.h | 2 ++
2 files changed, 6 insertions(+)
diff --git a/drivers/gpu
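A minimal sketch of the clamping idea described in this patch, not the driver's code. The minimum value, the function names, and the `log2(size / 4) - 1` encoding are all assumptions made for illustration, since the diff is truncated here; the point is only that a too-small size can wrap around in unsigned arithmetic when encoded, and that clamping first avoids it.

```c
#include <stdint.h>

/* Hypothetical minimum ring size in bytes; the real constant is not visible here. */
#define EXAMPLE_MIN_QUEUE_RING_SIZE 1024u

/* Floor of log2(v) for v > 0; returns 0 for v == 0. */
static uint32_t example_log2_floor(uint32_t v)
{
    uint32_t r = 0;

    while (v >>= 1)
        r++;
    return r;
}

/*
 * Assumed encoding of a ring size into a register field.
 * For very small sizes the subtraction wraps around (unsigned underflow).
 */
static uint32_t example_encode_ring_size(uint32_t size_bytes)
{
    return example_log2_floor(size_bytes / 4u) - 1u;
}

/* Clamp a user-supplied ring size up to the assumed minimum. */
static uint32_t example_clamp_ring_size(uint32_t requested)
{
    if (requested < EXAMPLE_MIN_QUEUE_RING_SIZE)
        return EXAMPLE_MIN_QUEUE_RING_SIZE;
    return requested;
}
```

Under these assumptions, encoding a size of 0 directly wraps to the largest 32-bit value, while encoding the clamped size yields a small, valid field value.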
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE, preserve
bitfields that do not need to be modified, as they contain flags that
track queue states used by CP FW.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 3 ++-
drivers/gpu/drm/amd/amdkfd
When userspace applications call AMDKFD_IOC_UPDATE_QUEUE, preserve
bitfields that do not need to be modified, as they contain flags that
track queue states used by CP FW.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v10.c | 4 +++-
drivers/gpu/drm/amd/amdkfd
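The "preserve bitfields" approach above is commonly done as a read-modify-write under a mask. The sketch below assumes a single 32-bit control word and a hypothetical mask of driver-owned bits; neither the mask value nor the function name comes from the patch.

```c
#include <stdint.h>

/* Hypothetical mask: the bits the driver is allowed to rewrite on update. */
#define EXAMPLE_DRIVER_OWNED_MASK 0x0000003Fu

/*
 * Rebuild a control word on queue update: take the driver-owned bits from
 * new_fields and keep every other bit (e.g. firmware state flags) as-is.
 */
static uint32_t example_update_control(uint32_t current, uint32_t new_fields)
{
    return (current & ~EXAMPLE_DRIVER_OWNED_MASK) |
           (new_fields & EXAMPLE_DRIVER_OWNED_MASK);
}
```

Overwriting the whole word instead would clear any state bits that firmware had set since the queue was created, which is the failure mode the patch description points at.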
Cacheline size is not available in IP discovery for gc943 and gc944.
Signed-off-by: David Yat Sin
Reviewed-by: Harish Kasiviswanathan
---
drivers/gpu/drm/amd/amdkfd/kfd_crat.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_crat.c
b/drivers/gpu/drm/amd
Fix an issue where user events of type KFD_EVENT_TYPE_HW_EXCEPTION do not
have valid data.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 4
1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
b/drivers/gpu/drm/amd/amdkfd
Change local memory type to MTYPE_UC on revision id 0
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +--
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 7 +--
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b
Change local memory type to MTYPE_UC on revision id 0
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 7 +--
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 8 +---
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
b
Change local memory type on gfx943 to MTYPE_UC on revision id 0
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 5 -
drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 8 +---
2 files changed, 9 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu
returned by this new ioctl is guaranteed to
succeed, barring races with other allocating tasks.
This IOCTL will be used by libhsakmt:
https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
Signed-off-by: Daniel Phillips
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu
returned by this new ioctl is guaranteed to
succeed, barring races with other allocating tasks.
This IOCTL will be used by libhsakmt:
https://www.mail-archive.com/amd-gfx@lists.freedesktop.org/msg75743.html
Signed-off-by: Daniel Phillips
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu
returned by this new ioctl is guaranteed to
succeed, barring races with other allocating tasks.
Signed-off-by: Daniel Phillips
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h| 1 +
.../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 37 +--
drivers/gpu/drm
Adding support to checkpoint/restore GWS (Global Wave Sync) queues.
Signed-off-by: David Yat Sin
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +++---
2 files changed, 8 insertions
dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated
each time the queue gets evicted.
Fixes: b8020b0304c8 ("drm/amdkfd: Enable over-subscription with >1 GWS queue")
Signed-off-by: David Yat Sin
Reviewed-by: Felix Kuehling
---
.../drm/amd/amdkfd/kfd_dev
Adding support to checkpoint/restore GWS (Global Wave Sync) queues.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 10 +++---
2 files changed, 8 insertions(+), 4 deletions(-)
diff
dqm->gws_queue_count and pdd->qpd.mapped_gws_queue need to be updated
each time the queue gets evicted.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 83 +--
1 file changed, 37 insertions(+), 46 deletions(-)
diff --git a/drivers/gpu/d
Adding support to checkpoint/restore GWS (Global Wave Sync) queues.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 4 ++--
.../amd/amdkfd/kfd_process_queue_manager.c| 22 ++-
3 files
A queue can be inactive during process termination. This would cause
dqm->gws_queue_count to not be decremented. There can only be 1 GWS
queue per device process, so move the logic out of the loop.
Signed-off-by: David Yat Sin
---
.../gpu/drm/amd/amdkfd/kfd_device_queue_manager.c|
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 6 ++
include/uapi/linux/kfd_ioctl.h | 2 ++
2 files changed, 8 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index e1e2362841f8
Set dmabuf handle to invalid for BOs that cannot be accessed using SDMA
during checkpoint/restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 8 ++--
include/uapi/linux/kfd_ioctl.h | 2 ++
2 files changed, 8 insertions(+), 2 deletions(-)
diff
Export dmabuf handles for GTT BOs so that their contents can be accessed
using SDMA during checkpoint/restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12
include/uapi/linux/kfd_ioctl.h | 3 ++-
2 files changed, 10 insertions(+), 5
Export dmabuf handles for GTT BOs so that their contents can be accessed
using SDMA during checkpoint/restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd
When the process is getting restored, the queues are not mapped yet, so
there is no VMID assigned for this process and no TLBs to flush.
Signed-off-by: David Yat Sin
Reviewed-by: Felix Kuehling
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 30 +---
1 file changed, 1
Refactor CRIU restore BO to reduce indentation before adding support for
IPC.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 271 +++
1 file changed, 129 insertions(+), 142 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
b
Fix for possible integer overflow when doing addition.
Reported-by: Dan Carpenter
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
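As an illustration of the class of bug this patch addresses, here is a user-space sketch of an overflow-safe bounds check. It uses the GCC/Clang builtin `__builtin_add_overflow`; the function name and parameters are hypothetical, and the actual one-line fix in the patch is not visible in this excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true only if a + b can be computed without wrapping and the sum
 * does not exceed limit. A naive "a + b <= limit" check can pass
 * incorrectly when the addition wraps around.
 */
static bool example_sizes_fit(uint32_t a, uint32_t b, uint32_t limit)
{
    uint32_t total;

    if (__builtin_add_overflow(a, b, &total))
        return false; /* the addition wrapped */
    return total <= limit;
}
```

With 32-bit operands, `UINT32_MAX + 1` wraps to 0, so the naive comparison would accept it; the checked version rejects it.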
When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 60 +--
1 file changed, 41 insertions(+), 19 deletions(-)
diff --git a/drivers
's.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 416 ---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 5 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 9 +
drivers/gpu/drm/amd/amdkfd/kfd_process.c |
When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2
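Restoring a queue with the ID it had at dump time amounts to an allocator that can either pick the first free slot or claim a specific requested slot. The sketch below shows that idea with a 64-bit bitmap; the names, the slot limit, and the convention of passing a negative value for "any ID" are assumptions for illustration and not the driver's interface.

```c
#include <stdint.h>

#define EXAMPLE_MAX_QUEUES 64

/*
 * Allocate a queue ID from a bitmap of used slots.
 * requested >= 0: claim exactly that ID (restore path); fail if taken.
 * requested <  0: claim the first free ID (normal creation path).
 * Returns the ID on success, or -1 on failure.
 */
static int example_alloc_queue_id(uint64_t *used, int requested)
{
    if (requested >= 0) {
        if (requested >= EXAMPLE_MAX_QUEUES ||
            (*used & (UINT64_C(1) << requested)))
            return -1;
        *used |= UINT64_C(1) << requested;
        return requested;
    }

    for (int i = 0; i < EXAMPLE_MAX_QUEUES; i++) {
        if (!(*used & (UINT64_C(1) << i))) {
            *used |= UINT64_C(1) << i;
            return i;
        }
    }
    return -1;
}
```

On restore, every queue would go through the "requested" path so that IDs held by the checkpointed application stay valid.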
Dump contents of queue control stacks on CRIU dump and restore them
during CRIU restore.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
.../drm/amd/amdkfd
Add support to existing CRIU ioctls to save and restore events during
CRIU checkpoint and restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 61 +
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 322 +--
drivers/gpu/drm/amd/a
Dump contents of queue MQDs on CRIU dump and restore them during CRIU
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
.../drm/amd/amdkfd/kfd_device_queue_manager.c
Introducing pause IOCTL. The CRIU amdgpu plugin needs
to call AMDKFD_IOC_CRIU_PAUSE(pause = 1) before starting dump and
AMDKFD_IOC_CRIU_PAUSE(pause = 0) when dump is complete. This ensures
that the queues are not modified between each CRIU dump ioctl.
Signed-off-by: David Yat Sin
---
drivers
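The ordering described above (pause, dump, unpause) can be sketched as a small wrapper that always unpauses, even when the dump step fails. The callbacks are hypothetical stand-ins for the pause ioctl and the dump ioctls; no real KFD ioctl structures are used, and the two fake callbacks exist only so the wrapper can be tried in isolation.

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the pause and dump ioctls. */
struct example_criu_ops {
    int (*set_pause)(void *ctx, int pause);
    int (*dump)(void *ctx);
};

/*
 * Pause the queues, run the dump, then unpause regardless of the dump
 * result. Returns the first error encountered, or 0 on success.
 */
static int example_dump_paused(const struct example_criu_ops *ops, void *ctx)
{
    int ret = ops->set_pause(ctx, 1);

    if (ret)
        return ret;

    ret = ops->dump(ctx);

    /* Always try to resume the queues, even if the dump failed. */
    int unpause_ret = ops->set_pause(ctx, 0);

    return ret ? ret : unpause_ret;
}

/* Fake pause: records the pause state in an int pointed to by ctx. */
static int example_fake_pause(void *ctx, int pause)
{
    int *state = ctx;

    *state = pause;
    return 0;
}

/* Fake dump: succeeds only while the queues are paused. */
static int example_fake_dump(void *ctx)
{
    int *state = ctx;

    return *state == 1 ? 0 : -1;
}
```

Keeping the unpause on the error path matters because a failed dump would otherwise leave the target's queues stopped.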
further process
the SDMA command submissions.
With SDMA, we see a large improvement in checkpoint and restore operations
compared to the generic PCI-based access via the host data path.
Suggested-by: Felix Kuehling
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd
Add support to existing CRIU ioctls to save number of queues and queue
properties for each queue during checkpoint and re-create queues on
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h
When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 +-
.../amd/amdkfd
during a restore operation.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 +++
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h | 2 +
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 188 ++-
drivers/gpu/drm/amd/amdkfd
-Restoring on a different system
V1: Initial
V2: Addressed review comments
V3: Rebased on latest amd-staging-drm-next
PS: There will be an upcoming V4 patch series with minor additions to the APIs
to support HMM.
David Yat Sin (9):
drm/amdkfd: CRIU Implement KFD pause ioctl
drm/amdkfd: CRI
From: Rajneesh Bhardwaj
- Update debug config for Checkpoint-Restore (CR) support
- Also include necessary options for CR with docker containers.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
arch/x86/configs/rock-dbg_defconfig | 53 ++---
1 file
criu process, attach old IDR values to newly
created BOs. This also adds the minimal GPU mapping support for a
single-GPU checkpoint/restore use case.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 297 ++-
1 file
stage-4 of the
restore process i.e. criu_resume ioctl is received, and the process is
ready to be resumed. This ioctl is different from other KFD CRIU ioctls
since it is called by the CRIU master restore process for all the target
processes being resumed by CRIU.
Signed-off-by: David Yat Sin
Signed-off
h has elevated ptrace
attached privileges and CAP_SYS_ADMIN capabilities attached with the
file descriptors so modify KFD to allow such calls.
(API redesigned by David Yat Sin)
Suggested-by: Felix Kuehling
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/a
id of CRIU dumper process. Also the pid of a process
inside a container might be different than its global pid so return
the ns pid.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +++-
drivers/gpu/drm/amd/a
further process
the SDMA command submissions.
With SDMA, we see a large improvement in checkpoint and restore operations
compared to the generic PCI-based access via the host data path.
Suggested-by: Felix Kuehling
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd
Add support to existing CRIU ioctls to save number of queues and queue
properties for each queue during checkpoint and re-create queues on
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h
Dump contents of queue MQDs on CRIU dump and restore them during CRIU
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
.../drm/amd/amdkfd/kfd_device_queue_manager.c
during a restore operation.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 1f114a541bd21873de905db64bb9efa673274d4b)
(cherry picked from commit 20c435fad57d3201e5402e38ae778f1f0f84a09d)
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 +++
drivers/gpu
stage-4 of the
restore process i.e. criu_resume ioctl is received, and the process is
ready to be resumed. This ioctl is different from other KFD CRIU ioctls
since it is called by the CRIU master restore process for all the target
processes being resumed by CRIU.
Signed-off-by: David Yat Sin
Signed-off
criu process, attach old IDR values to newly
created BOs. This also adds the minimal GPU mapping support for a
single-GPU checkpoint/restore use case.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 47bb685701c336d1fde7e91be93d9cabe89a4c1b)
(cherry picked
When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 +-
.../amd/amdkfd
When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2
From: Rajneesh Bhardwaj
Update rock-rel_defconfig for monolithic kernel release that enables
CRIU support with kfd.
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 4a6d309a82648a23a4fc0add83013ac6db6187d5)
Signed-off-by: David Yat Sin
---
arch/x86/configs/rock-rel_defconfig | 13
From: Rajneesh Bhardwaj
This reverts commit 12ebe2b9df192a2a8580cd9ee3e9940c116913c8.
This is just a temporary workaround and will be dropped later.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 +++
1 file changed, 7
h has elevated ptrace
attached privileges and CAP_SYS_ADMIN capabilities attached with the file
descriptors so modify KFD to allow such calls.
(API redesign suggested by Felix Kuehling and implemented by David Yat
Sin)
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from c
Dump contents of queue control stacks on CRIU dump and restore them
during CRIU restore.
(rajneesh: rebased to 5.11 and fixed merge conflict)
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 2 +-
drivers/gpu/drm/amd/amdkfd
When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 61 +--
1 file changed, 41 insertions(+), 20 deletions(-)
diff --git a/drivers
41a47)
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14
3 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/dr
Introducing pause IOCTL. The CRIU amdgpu plugin needs
to call AMDKFD_IOC_CRIU_PAUSE(pause = 1) before starting dump and
AMDKFD_IOC_CRIU_PAUSE(pause = 0) when dump is complete. This ensures
that the queues are not modified between each CRIU dump ioctl.
Signed-off-by: David Yat Sin
---
drivers
Add support to existing CRIU ioctls to save and restore events during
CRIU checkpoint and restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 61 +
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 322 +--
drivers/gpu/drm/amd/a
's.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 409 ---
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 5 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10 +
drivers/gpu/drm/amd/amdkfd/kfd_process.c |
From: Rajneesh Bhardwaj
- Update debug config for Checkpoint-Restore (CR) support
- Also include necessary options for CR with docker containers.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
arch/x86/configs/rock-dbg_defconfig | 53 ++---
1 file
different system
V2: Addressed review comments
David Yat Sin (9):
drm/amdkfd: CRIU Implement KFD pause ioctl
drm/amdkfd: CRIU add queues support
drm/amdkfd: CRIU restore queue ids
drm/amdkfd: CRIU restore sdma id for queues
drm/amdkfd: CRIU restore queue doorbell id
drm/amdkfd: CRIU dump
further process
the SDMA command submissions.
With SDMA, we see a large improvement in checkpoint and restore operations
compared to the generic PCI-based access via the host data path.
Suggested-by: Felix Kuehling
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd
's.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 400 +--
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 5 +-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 10 +
drivers/gpu/drm/amd/amdkfd/kfd_process.c |
Dump contents of queue control stacks on CRIU dump and restore them
during CRIU restore.
(rajneesh: rebased to 5.11 and fixed merge conflict)
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 31 ---
drivers/gpu
Dump contents of queue MQDs on CRIU dump and restore them during CRIU
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 53 ++
drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c | 2 +-
.../drm/amd/amdkfd/kfd_device_queue_manager.c
From: Rajneesh Bhardwaj
This reverts commit 12ebe2b9df192a2a8580cd9ee3e9940c116913c8.
This is just a temporary workaround and will be dropped later.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 7 +++
1 file changed, 7
Add support to existing CRIU ioctls to save and restore events during
CRIU checkpoint and restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 130 +++-
drivers/gpu/drm/amd/amdkfd/kfd_events.c | 253 ---
drivers/gpu/drm/amd/a
When re-creating queues during CRIU restore, restore the queue with the
same doorbell id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 61 +--
1 file changed, 41 insertions(+), 20 deletions(-)
diff --git a/drivers
When re-creating queues during CRIU restore, restore the queue with the
same queue id value used during CRIU dump. Adding a new private
structure queue_restore_data to store queue restore information.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd
When re-creating queues during CRIU restore, restore the queue with the
same sdma id value used during CRIU dump.
Signed-off-by: David Yat Sin
---
.../drm/amd/amdkfd/kfd_device_queue_manager.c | 48 ++-
.../drm/amd/amdkfd/kfd_device_queue_manager.h | 3 +-
.../amd/amdkfd
Add support to existing CRIU ioctls to save number of queues and queue
properties for each queue during checkpoint and re-create queues on
restore.
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 380 ++-
drivers/gpu/drm/amd/amdkfd/kfd_p
Introducing pause IOCTL. The CRIU amdgpu plugin needs
to call AMDKFD_IOC_CRIU_PAUSE(pause = 1) before starting dump and
AMDKFD_IOC_CRIU_PAUSE(pause = 0) when dump is complete. This ensures
that the queues are not modified between each CRIU dump ioctl.
Signed-off-by: David Yat Sin
---
drivers
stage-4 of the
restore process i.e. criu_resume ioctl is received, and the process is
ready to be resumed. This ioctl is different from other KFD CRIU ioctls
since it is called by the CRIU master restore process for all the target
processes being resumed by CRIU.
Signed-off-by: David Yat Sin
Signed-off
criu process, attach old IDR values to newly
created BOs. This also adds the minimal GPU mapping support for a
single-GPU checkpoint/restore use case.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 47bb685701c336d1fde7e91be93d9cabe89a4c1b)
(cherry picked
From: Rajneesh Bhardwaj
- Update debug config for Checkpoint-Restore (CR) support
- Also include necessary options for CR with docker containers.
Signed-off-by: Rajneesh Bhardwaj
Signed-off-by: David Yat Sin
---
arch/x86/configs/rock-dbg_defconfig | 53 ++---
1 file
during a restore operation.
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 1f114a541bd21873de905db64bb9efa673274d4b)
(cherry picked from commit 20c435fad57d3201e5402e38ae778f1f0f84a09d)
---
drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 20 +++
drivers/gpu
h has elevated ptrace
attached privileges and CAP_SYS_ADMIN capabilities attached with the file
descriptors so modify KFD to allow such calls.
(API redesign suggested by Felix Kuehling and implemented by David Yat
Sin)
Signed-off-by: David Yat Sin
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from c
From: Rajneesh Bhardwaj
Update rock-rel_defconfig for monolithic kernel release that enables
CRIU support with kfd.
Signed-off-by: Rajneesh Bhardwaj
(cherry picked from commit 4a6d309a82648a23a4fc0add83013ac6db6187d5)
Signed-off-by: David Yat Sin
---
arch/x86/configs/rock-rel_defconfig | 13
41a47)
Signed-off-by: David Yat Sin
---
drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 44 +++-
drivers/gpu/drm/amd/amdkfd/kfd_priv.h| 2 ++
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 14
3 files changed, 59 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/dr
different system
David Yat Sin (9):
drm/amdkfd: CRIU Implement KFD pause ioctl
drm/amdkfd: CRIU add queues support
drm/amdkfd: CRIU restore queue ids
drm/amdkfd: CRIU restore sdma id for queues
drm/amdkfd: CRIU restore queue doorbell id
drm/amdkfd: CRIU dump and restore queue mqds
drm/amdkfd