On 2025-12-16 17:27, Chen, Xiaogang wrote:


Can this patch stand alone? I think it needs to be combined with patch 6.

This patch allocates the MQD in VRAM without a GART mapping, so FW accesses the MQD via the FB aperture address with mtype UC. This works fine on its own; patch 6 then adds a GART mapping with mtype RW to improve performance.

On 12/15/2025 10:56 AM, Philip Yang wrote:
To reduce queue switch latency further, move the MQD to the VRAM domain.
The CP accesses the MQD and control stack via the FB aperture, which
requires contiguous pages.

Signed-off-by: Philip Yang<[email protected]>
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c      | 3 ++-
  drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c | 2 +-
  2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 090d17911bc4..113c058cf7b5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -329,7 +329,8 @@ int amdgpu_amdkfd_alloc_kernel_mem(struct amdgpu_device *adev, size_t size,
        bp.size = size;
        bp.byte_align = PAGE_SIZE;
        bp.domain = domain;
-       bp.flags = AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+       bp.flags = AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
+                  AMDGPU_GEM_CREATE_CPU_GTT_USWC;

Should the bp.flags setting depend on the domain type, i.e. only when domain is AMDGPU_GEM_DOMAIN_VRAM do bp.flags |= AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS?

AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS is only used by the VRAM buddy allocator. We combine GTT and VRAM allocation flags in other places too, and use the domain to decide the mm.

Regards,
Philip
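
For readers comparing the two options raised in this exchange, here is a minimal sketch in plain C. It is not driver code: the SKETCH_* constants and both helper functions are hypothetical stand-ins, and the numeric values are placeholders rather than the definitions from the amdgpu uapi header.

```c
#include <stdint.h>

/* Placeholder values for illustration only; the real definitions live in
 * the amdgpu uapi header and are not reproduced here. */
#define SKETCH_DOMAIN_GTT              0x2u
#define SKETCH_DOMAIN_VRAM             0x4u
#define SKETCH_CREATE_CPU_GTT_USWC     (1u << 2)
#define SKETCH_CREATE_VRAM_CONTIGUOUS  (1u << 5)

/* Option taken by the patch: always pass both flags and rely on the
 * allocator for the chosen domain to ignore the flag it does not use. */
static uint64_t sketch_flags_unconditional(uint32_t domain)
{
	(void)domain; /* the domain does not influence the flags here */
	return SKETCH_CREATE_VRAM_CONTIGUOUS | SKETCH_CREATE_CPU_GTT_USWC;
}

/* Option asked about in review: add the contiguous flag only when the
 * requested domain includes VRAM. */
static uint64_t sketch_flags_by_domain(uint32_t domain)
{
	uint64_t flags = SKETCH_CREATE_CPU_GTT_USWC;

	if (domain & SKETCH_DOMAIN_VRAM)
		flags |= SKETCH_CREATE_VRAM_CONTIGUOUS;
	return flags;
}
```

Both helpers return the same flags for a VRAM request; they differ only for a GTT request, where the unconditional variant carries a flag that, per the reply above, only the VRAM buddy allocator looks at.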

Regards

Xiaogang

        bp.type = ttm_bo_type_kernel;
        bp.resv = NULL;
        bp.bo_ptr_size = sizeof(struct amdgpu_bo);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
index d234db138182..14123e1a9716 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c
@@ -139,7 +139,7 @@ static struct kfd_mem_obj *allocate_mqd(struct kfd_node *node,
                        (ALIGN(q->ctl_stack_size, PAGE_SIZE) +
                        ALIGN(sizeof(struct v9_mqd), PAGE_SIZE)) *
                        NUM_XCC(node->xcc_mask),
-                       AMDGPU_GEM_DOMAIN_GTT,
+                       AMDGPU_GEM_DOMAIN_VRAM,
                        &(mqd_mem_obj->mem),
                        &(mqd_mem_obj->gpu_addr),
                        (void *)&(mqd_mem_obj->cpu_ptr), true);
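
The size expression in the allocate_mqd hunk above determines how large the contiguous VRAM allocation has to be. A rough standalone sketch of that arithmetic follows, assuming a 4 KiB page size and assuming NUM_XCC() counts the set bits in the XCC mask. All sketch_* names are hypothetical, and the MQD struct size is passed in as a parameter because struct v9_mqd is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE ((size_t)4096) /* assumption: 4 KiB pages */

/* Round x up to the next multiple of a; a must be a power of two. */
static size_t sketch_align_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Count set bits in the mask (assumed to be what NUM_XCC() does). */
static unsigned int sketch_num_xcc(uint16_t xcc_mask)
{
	unsigned int n = 0;

	while (xcc_mask) {
		n += xcc_mask & 1u;
		xcc_mask >>= 1;
	}
	return n;
}

/* Total bytes requested: one page-aligned control stack plus one
 * page-aligned MQD per XCC. */
static size_t sketch_mqd_bo_size(size_t ctl_stack_size, size_t mqd_size,
				 uint16_t xcc_mask)
{
	return (sketch_align_up(ctl_stack_size, SKETCH_PAGE_SIZE) +
		sketch_align_up(mqd_size, SKETCH_PAGE_SIZE)) *
	       sketch_num_xcc(xcc_mask);
}
```

For example, with a 5000-byte control stack, a 2048-byte MQD, and two XCCs, the sketch requests (8192 + 4096) * 2 = 24576 bytes, all of which must be physically contiguous once the buffer lives in VRAM and is reached through the FB aperture.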
