[PATCH] drm/amdgpu: add function descripion of new functions

2024-04-26 Thread Sunil Khatri
Add function descriptions for the new functions added in amd_ip_funcs. The new functions are: a. dump_ip_state b. print_ip_state Signed-off-by: Sunil Khatri --- drivers/gpu/drm/amd/include/amd_shared.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/gpu/drm/amd/include/amd_shared.

Re: [PATCH] drm/amdgpu: fix overflowed array index read warning

2024-04-26 Thread Christian König
On 26.04.24 at 02:27, Tim Huang wrote: Clear the overflowed array index read warning with a cast operation. Signed-off-by: Tim Huang Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/a

[PATCH 1/3] drm/amd/pm: Fix negative array index read warning for pptable->DpmDescriptor

2024-04-26 Thread Jesse Zhang
Avoid using negative values of clk_index as an index into the array pptable->DpmDescriptor. Signed-off-by: Jesse Zhang --- .../gpu/drm/amd/pm/swsmu/smu11/navi10_ppt.c | 25 +++ 1 file changed, 20 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu11

[PATCH 2/3] drm/amd/pm: fix the Out-of-bounds read warning

2024-04-26 Thread Jesse Zhang
Using index i - 1U may go beyond the valid element range of mc_data[] when i = 0. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/ppatomctrl.c b/drivers/gpu/drm
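
The hazard described in the patch above comes from unsigned arithmetic: when i is 0, the expression i - 1U wraps around to a very large value instead of becoming -1. A minimal sketch of one way to guard such an access follows. The function name prev_entry, its signature and the table type are hypothetical and are not taken from ppatomctrl.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the code from the patch: read the element
 * just before position i. The access is refused when i == 0 (where
 * i - 1U would wrap around) or when i - 1 lies beyond the table end.
 */
static bool prev_entry(const uint32_t *table, size_t len, unsigned int i,
                       uint32_t *out)
{
	assert(table != NULL && out != NULL);

	if (i == 0 || (size_t)i - 1 >= len)
		return false;

	*out = table[i - 1U];
	return true;
}
```

Returning a status and writing through an out-parameter keeps the "no previous element" case explicit, so a caller cannot index the table with a wrapped value by accident.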

[PATCH 3/3] drm/amd/pm: fix the uninitialized scalar variable warning

2024-04-26 Thread Jesse Zhang
Fix warning for using uninitialized values sclk_mask, mck_mask and soc_mask. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c b/drivers/gpu

[bug report] KFENCE: use-after-free read in amdgpu_bo_move+0x1ce/0x710 [amdgpu]

2024-04-26 Thread voidastro
platform: Ryzen 5600U [520277.842817] == [520277.842821] BUG: KFENCE: use-after-free read in amdgpu_bo_move+0x1ce/0x710 [amdgpu] [520277.843054] Use-after-free read at 0x31f4f80d (in kfence-#198): [520277.843057] amdgpu_bo_

Re: [PATCH] drm/amdgpu: Fix out-of-bounds write warning

2024-04-26 Thread Christian König
On 26.04.24 at 05:24, Ma, Jun wrote: On 4/25/2024 8:39 PM, Christian König wrote: On 25.04.24 at 12:00, Ma Jun wrote: Check the ring type value to fix the out-of-bounds write warning Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 5 + 1 file changed, 5 inserti

Re: [PATCH 3/4] drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic

2024-04-26 Thread Christian König
On 26.04.24 at 05:57, Yunxiang Li wrote: The retry loop for SRIOV reset has refcount and memory leak issues. Depending on which function call fails, it can potentially call amdgpu_amdkfd_pre/post_reset a different number of times, causing the kfd_locked count to be wrong. This will block all futur

RE: [PATCH] drm/amdgpu: add ACA error query support for umc_v12_0

2024-04-26 Thread Wang, Yang(Kevin)
[AMD Official Use Only - General] Please ignore this patch, Thomas will submit a new patch to replace it. Best Regards, Kevin -Original Message- From: Zhou1, Tao Sent: Friday, April 26, 2024 11:15 AM To: Wang, Yang(Kevin) ; amd-gfx@lists.freedesktop.org Cc: Zhang, Hawking ; Chai, Thomas

[PATCH 1/2] drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs

2024-04-26 Thread Lang Yu
Small APUs (i.e., consumer and embedded products) usually have a small carveout of device memory, which can't satisfy the memory allocation requirements of most compute workloads. We can't even run a Basic MNIST Example with a default 512MB carveout. https://github.com/pytorch/examples/tree/main/mnist. Though w

[PATCH 2/2] drm/amdkfd: Allow memory oversubscription on small APUs

2024-04-26 Thread Lang Yu
The default ttm_tt_pages_limit is 1/2 of system memory. It is prone to out of memory with such a configuration. Signed-off-by: Lang Yu --- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h | 4 ++-- drivers/gpu/drm/amd/amdgpu/amdgpu_amdkf

RE: [PATCH 3/4] drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic

2024-04-26 Thread Deng, Emily
[AMD Official Use Only - General] >-Original Message- >From: Li, Yunxiang (Teddy) >Sent: Friday, April 26, 2024 11:58 AM >To: amd-gfx@lists.freedesktop.org >Cc: Deucher, Alexander ; Koenig, Christian >; Lazar, Lijo ; Kuehling, >Felix ; Deng, Emily ; Li, >Yunxiang (Teddy) >Subject: [PATCH

[PATCH] drm/amd/pm: fix uninitialized variable warning for smu8_hwmgr

2024-04-26 Thread Tim Huang
Clear warnings about using the uninitialized value level when getting the value from the SMU fails. Signed-off-by: Tim Huang --- .../drm/amd/pm/powerplay/hwmgr/smu8_hwmgr.c| 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu

[PATCH] drm/amdgpu/pm: Check the return value of smum_send_msg_to_smc

2024-04-26 Thread Ma Jun
Check the return value of smum_send_msg_to_smc, otherwise we might use an uninitialized variable "now" Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr
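
The pattern behind this fix can be sketched in isolation: an out-parameter is only valid when the call that fills it reports success, so the caller must check the return value before reading it. In the sketch below, query_firmware is a hypothetical stand-in for a message call such as smum_send_msg_to_smc; its signature, the value 1200 and read_clock are assumptions made for illustration and do not reflect the powerplay code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a firmware query; on failure it leaves *value untouched. */
static int query_firmware(int simulate_failure, uint32_t *value)
{
	if (simulate_failure)
		return -EIO;

	*value = 1200; /* arbitrary illustrative reading */
	return 0;
}

/* Read a clock value, propagating the error instead of using garbage. */
static int read_clock(int simulate_failure, uint32_t *clock_out)
{
	uint32_t now; /* only meaningful after a successful query */
	int ret;

	assert(clock_out != NULL);

	ret = query_firmware(simulate_failure, &now);
	if (ret)
		return ret; /* 'now' was never written, so do not read it */

	*clock_out = now;
	return 0;
}
```

An alternative is to initialize the variable to a default, but checking the return value also lets the error reach the caller rather than silently reporting a made-up reading.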

[PATCH v3] drm/amdgpu: Fix the uninitialized variable warning

2024-04-26 Thread Ma Jun
Check the user input and phy_id value range to fix "Using uninitialized value phy_id" Signed-off-by: Ma Jun --- drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_securedisplay.c b/drivers/gpu/drm/amd/amdgp

[PATCH 1/2] drm/amd/pm: fix the uninitialized scalar variable waring

2024-04-26 Thread Jesse Zhang
Initialize variable size before calling hwmgr->hwmgr_func->iread_sensor, such as smu7_read_sensor. Signed-off-by: Jesse Zhang --- drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/gpu/drm/amd/pm/powerplay/amd_powerplay.c b/drivers/gpu/d

[PATCH 2/2] drm/amd/pm: fix uninitialized variable warning

2024-04-26 Thread Jesse Zhang
Check the return of function smum_send_msg_to_smc as it may fail to initialize the variable. Signed-off-by: Jesse Zhang --- .../drm/amd/pm/powerplay/hwmgr/smu10_hwmgr.c | 8 +-- .../drm/amd/pm/powerplay/hwmgr/smu7_hwmgr.c | 21 --- .../drm/amd/pm/powerplay/hwmgr/vega10_hw

Re: [PATCH] drm/amd/pm: fix uninitialized variable warning for smu8_hwmgr

2024-04-26 Thread Christian König
On 26.04.24 at 11:29, Tim Huang wrote: Clear warnings about using the uninitialized value level when getting the value from the SMU fails. Signed-off-by: Tim Huang Maybe drop the blank line before the "if (ret)", apart from that Reviewed-by: Christian König --- .../drm/amd/pm/powerplay/hwmgr/sm

Re: [PATCH] drm/amdgpu: add function descripion of new functions

2024-04-26 Thread Deucher, Alexander
[Public] Reviewed-by: Alex Deucher From: Sunil Khatri Sent: Friday, April 26, 2024 3:18 AM To: Deucher, Alexander ; Koenig, Christian Cc: amd-gfx@lists.freedesktop.org ; Khatri, Sunil Subject: [PATCH] drm/amdgpu: add function descripion of new functions Add

[PATCH v9 00/14] AMDGPU usermode queues

2024-04-26 Thread Shashank Sharma
This patch series introduces AMDGPU usermode queues for gfx workloads. Usermode queues are a method of GPU workload submission into the graphics hardware without any interaction with kernel/DRM schedulers. In this method, a userspace graphics application can create its own workqueue and submit it di

[PATCH v9 02/14] drm/amdgpu: add usermode queue base code

2024-04-26 Thread Shashank Sharma
This patch adds skeleton code for amdgpu usermode queue. It contains: - A new file with init functions of usermode queues. - A queue context manager in driver private data. V1: Worked on design review comments from RFC patch series: (https://patchwork.freedesktop.org/series/112214/) - Alex: Keep

[PATCH v9 01/14] drm/amdgpu: UAPI for user queue management

2024-04-26 Thread Shashank Sharma
From: Alex Deucher This patch introduces new UAPI/IOCTL for usermode graphics queue. The userspace app will fill this structure and request the graphics driver to add a graphics work queue for it. The output of this UAPI is a queue id. This UAPI maps the queue into GPU, so the graphics app can s

[PATCH v9 06/14] drm/amdgpu: create context space for usermode queue

2024-04-26 Thread Shashank Sharma
The FW expects us to allocate at least one page as context space to process gang, process, GDS and FW related work. This patch creates a joint object for the same, and calculates GPU space offsets of these spaces. V1: Addressed review comments on RFC patch: Alex: Make this function IP specifi

[PATCH v9 04/14] drm/amdgpu: add helpers to create userqueue object

2024-04-26 Thread Shashank Sharma
This patch introduces amdgpu_userqueue_object and its helper functions to create and destroy this object. The helper functions create/destroy a base amdgpu_bo, kmap/unmap it and save the respective GPU and CPU addresses in the encapsulating userqueue object. These helpers will be used to create

[PATCH v9 05/14] drm/amdgpu: create MES-V11 usermode queue for GFX

2024-04-26 Thread Shashank Sharma
A Memory queue descriptor (MQD) of a userqueue defines it in the hw's context. As MQD format can vary between different graphics IPs, we need gfx GEN specific handlers to create MQDs. This patch: - Adds a new file which will be used for MES based userqueue functions targeting GFX and SDMA IP. -

[PATCH v9 03/14] drm/amdgpu: add new IOCTL for usermode queue

2024-04-26 Thread Shashank Sharma
This patch adds: - A new IOCTL function to create and destroy - A new structure to keep all the user queue data in one place. - A function to generate unique index for the queue. V1: Worked on review comments from RFC patch series: - Alex: Keep a list of queues, instead of single queue per proce

[PATCH v9 08/14] drm/amdgpu: map wptr BO into GART

2024-04-26 Thread Shashank Sharma
To support oversubscription, MES FW expects WPTR BOs to be mapped into GART, before they are submitted to usermode queues. This patch adds a function for the same. V4: fix the wptr value before mapping lookup (Bas, Christian). V5: Addressed review comments from Christian: - Either pin object

[PATCH v9 09/14] drm/amdgpu: generate doorbell index for userqueue

2024-04-26 Thread Shashank Sharma
The userspace sends us the doorbell object and the relative doorbell index in the object to be used for the usermode queue, but the FW expects the absolute doorbell index on the PCI BAR in the MQD. This patch adds a function to convert this relative doorbell index to absolute doorbell index. V5: Fi
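
The relative-to-absolute conversion described above can be illustrated with plain arithmetic. This is only a sketch under stated assumptions: it supposes that the absolute index counts 32-bit slots from the start of the doorbell BAR, that the doorbell object starts at a known byte offset on that BAR, and that every doorbell occupies a fixed number of bytes. The function name absolute_doorbell_index and all parameter names are hypothetical, and the patch itself may compute the index differently.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative only: convert a doorbell index that is relative to a
 * doorbell object into an absolute index on the BAR, counted in 32-bit
 * slots. All names and the slot-based layout are assumptions.
 */
static uint64_t absolute_doorbell_index(uint64_t bo_offset_bytes,
                                        uint32_t relative_index,
                                        uint32_t doorbell_size_bytes)
{
	uint64_t slots_per_doorbell;

	assert(doorbell_size_bytes > 0);

	/* Round the doorbell size up to a whole number of 32-bit slots. */
	slots_per_doorbell = (doorbell_size_bytes + sizeof(uint32_t) - 1) /
			     sizeof(uint32_t);

	/* Slots before the object, plus slots before the chosen doorbell. */
	return bo_offset_bytes / sizeof(uint32_t) +
	       (uint64_t)relative_index * slots_per_doorbell;
}
```

For example, with the object placed 4096 bytes into the BAR and 8-byte doorbells, relative index 3 maps to slot 1024 + 3 * 2 = 1030 under these assumptions.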

[PATCH v9 07/14] drm/amdgpu: map usermode queue into MES

2024-04-26 Thread Shashank Sharma
This patch adds new functions to map/unmap a usermode queue into the FW, using the MES ring. As soon as this mapping is done, the queue would be considered ready to accept the workload. V1: Addressed review comments from Alex on the RFC patch series - Map/Unmap should be IP specific. V2:

[PATCH v9 11/14] drm/amdgpu: fix MES GFX mask

2024-04-26 Thread Shashank Sharma
The current MES GFX mask prevents FW from enabling oversubscription. This patch does the following: - Fixes the mask values and adds a description for the same. - Removes the central mask setup and makes it IP specific, as it would be different when the number of pipes and queues are different. V9: intr

[PATCH v9 14/14] drm/amdgpu: add kernel config for gfx-userqueue

2024-04-26 Thread Shashank Sharma
This patch: - adds a kernel config option "CONFIG_DRM_AMD_USERQ_GFX" - moves the userqueue initialization code for all IPs under this flag so that the userqueue works only when the config is enabled. Cc: Alex Deucher Cc: Christian Koenig Signed-off-by: Shashank Sharma --- drivers/gpu/drm/amd

[PATCH v9 10/14] drm/amdgpu: cleanup leftover queues

2024-04-26 Thread Shashank Sharma
This patch adds code to clean up any leftover userqueues which a user might have failed to destroy due to a crash or any other programming error. V7: Added Alex's R-B V8: Rebase V9: Rebase Cc: Alex Deucher Cc: Christian Koenig Reviewed-by: Alex Deucher Suggested-by: Bas Nieuwenhuizen Signed-of

[PATCH v9 12/14] drm/amdgpu: enable SDMA usermode queues

2024-04-26 Thread Shashank Sharma
This patch makes the necessary modifications to enable the SDMA usermode queues using the existing userqueue infrastructure. V9: introduced this patch in the series Cc: Christian König Cc: Alex Deucher Signed-off-by: Shashank Sharma Signed-off-by: Arvind Yadav Signed-off-by: Srinivasan Shanmugam

[PATCH v9 13/14] drm/amdgpu: enable compute/gfx usermode queue

2024-04-26 Thread Shashank Sharma
From: Arvind Yadav This patch makes the changes required to enable compute workload support using the existing usermode queues infrastructure. Cc: Alex Deucher Cc: Christian Koenig Signed-off-by: Arvind Yadav Signed-off-by: Shashank Sharma --- drivers/gpu/drm/amd/amdgpu/amdgpu_user

Re: [PATCH] drm/amd: Only allow one entity to control ABM

2024-04-26 Thread Mario Limonciello
On 4/13/2024 03:51, Gergo Koteles wrote: Hi > ABM will reduce the backlight and compensate by adjusting brightness and contrast of the image. It has 5 levels: 0, 1, 2, 3, 4. 0 means off. 4 means maximum backlight reduction. IMO, 1 and 2 look okay. 3 and 4 can be quite impactful, both to power

[PATCH 1/3] drm/amdgpu: Add amdgpu_bo_is_vm_bo helper

2024-04-26 Thread Tvrtko Ursulin
From: Tvrtko Ursulin Help code readability by replacing a bunch of: bo->tbo.base.resv == vm->root.bo->tbo.base.resv With: amdgpu_bo_is_vm_bo(bo, vm) No functional changes. Signed-off-by: Tvrtko Ursulin --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 2 +- drivers/gpu/drm/amd/amdgpu/amdgp
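
The readability gain described in this patch can be sketched with simplified stand-in types: the repeated pointer comparison is given a name once and call sites use the name. The structs below are heavily reduced assumptions that flatten the real bo->tbo.base.resv and vm->root.bo chains, and bo_is_vm_bo is an illustrative name, so this is not the helper as added to the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Heavily simplified stand-ins for the driver structures (assumptions). */
struct resv { int unused; };
struct bo { struct resv *resv; };
struct vm { struct bo *root_bo; };

/*
 * Sketch of the helper idea: a buffer object is treated as belonging to
 * the VM when it shares the reservation object of the VM's root BO.
 */
static bool bo_is_vm_bo(const struct bo *bo, const struct vm *vm)
{
	assert(bo != NULL && vm != NULL && vm->root_bo != NULL);

	return bo->resv == vm->root_bo->resv;
}
```

Call sites then read as a question about the object (is this BO a VM BO?) rather than as a comparison of two long member chains, which matches the "no functional changes" intent stated in the patch.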

[PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-26 Thread Tvrtko Ursulin
From: Tvrtko Ursulin When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs") added a new TTM region it failed to notice the conceptual imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin. That imbalance leads to such objects getting accounted against th

[PATCH 0/3] Some refactoring and maybe a memory accounting fixlet

2024-04-26 Thread Tvrtko Ursulin
From: Tvrtko Ursulin As I was reading through the driver I spotted one thing which could perhaps make it more readable (1/3), one thing which reduces some double conversions (in principle) from TTM placement back to domain (2/3), and also enables the last patch in the series which maybe fixes a s

[PATCH 2/3] drm/amdgpu: Reduce mem_type to domain double indirection

2024-04-26 Thread Tvrtko Ursulin
From: Tvrtko Ursulin All memory domains apart from AMDGPU_GEM_DOMAIN_GTT map 1:1 to TTM placements. And the former can be either AMDGPU_PL_PREEMPT or TTM_PL_TT, depending on AMDGPU_GEM_CREATE_PREEMPTIBLE. Simplify a few places in the code which convert the TTM placement into a domain by checking aga

RE: [PATCH 3/4] drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic

2024-04-26 Thread Li, Yunxiang (Teddy)
[Public] > Why remove this? Oops it's a copy-paste error from the previous revision > Need to call amdgpu_virt_release_full_gpu(adev, true) before retry, and the > same as below. I thought we talked about if we call amdgpu_virt_{reset,request_full}_gpu again we don't need to release full gpu, I

[PATCH v4 2/4] drm/amdgpu: Add reset_context flag for host FLR

2024-04-26 Thread Yunxiang Li
There are other reset sources that pass NULL as the job pointer, such as amdgpu_amdkfd_reset_work. Therefore, using the job pointer to check if the FLR comes from the host does not work. Add a flag in reset_context to explicitly mark host triggered reset, and set this flag when we receive host res

[PATCH v2 3/4] drm/amdgpu: Fix amdgpu_device_reset_sriov retry logic

2024-04-26 Thread Yunxiang Li
The retry loop for SRIOV reset has refcount and memory leak issues. Depending on which function call fails, it can potentially call amdgpu_amdkfd_pre/post_reset a different number of times, causing the kfd_locked count to be wrong. This will block all future attempts at opening /dev/kfd. The retry loop

[PATCH] drm/amdgpu: add gfx12 mqd structures

2024-04-26 Thread Alex Deucher
From: Likun Gao Memory queue descriptors for gfx12. v2: squash in sdma updates (Alex) Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/include/v12_structs.h | 1188 + 1 file changed, 1188 insertions(+) create mode 1

[PATCH 2/7] drm/amdgpu: Add sdma fw v3 structure

2024-04-26 Thread Alex Deucher
From: Likun Gao Add sdma firmware struct version 3 to support sdma v7_0 firmware. Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 6 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.h | 9 + 2 files changed

[PATCH 4/7] drm/amdgpu/sdma7: set sdma hang watchdog

2024-04-26 Thread Alex Deucher
From: Jack Xiao Set SDMAx_WATCHDOG_CNTL.QUEUE_HANG_COUNT registers to improve SDMA reliability. Signed-off-by: Jack Xiao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/sdma_v7_0.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/gpu/dr

[PATCH 1/7] drm/amdgpu: Add new members for sdma v7_0 fw

2024-04-26 Thread Alex Deucher
From: Likun Gao Add new members in sdma instance structure for sdma v7_0 firmware. Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h | 4 1 file changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/a

[PATCH 3/7] drm/amdgpu: Add sdma v7_0 ip block support (v7)

2024-04-26 Thread Alex Deucher
From: Likun Gao v1: Add sdma v7_0 ip block support. (Likun) v2: Move vmhub from ring_funcs to ring. (Hawking) v3: Switch to AMDGPU_GFXHUB(0). (Hawking) v4: Move microcode init into early_init. (Likun) v5: Fix warnings (Alex) v6: Squash in various fixes (Alex) v7: Rebase (Alex) v8: Rebase (Alex)

[PATCH 7/7] drm/amdgpu/discovery: add sdma v7_0 ip block

2024-04-26 Thread Alex Deucher
From: Likun Gao Add sdma v7_0 ip block. v2: squash in updates (Alex) Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_discovery.c | 5 + 1 file changed, 5 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_d

[PATCH 6/7] drm/amdgpu: provide more ucode name shown via id

2024-04-26 Thread Alex Deucher
From: Likun Gao Provide some missing ucode names shown via firmware ID. v2: fix whitespace (Alex) Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 24 +++ 1 file changed, 24 insertions(+) diff --

[PATCH 5/7] drm/amdgpu: support SDMA v3 struct fw front door load

2024-04-26 Thread Alex Deucher
From: Likun Gao Add support for new SDMA firmware struct (V3) with PSP front door load type. Signed-off-by: Likun Gao Reviewed-by: Hawking Zhang Signed-off-by: Alex Deucher --- drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 10 ++ driver

[PATCH] drm/amdkfd: Flush the process wq before creating a kfd_process

2024-04-26 Thread Lancelot SIX
There is a race condition when re-creating a kfd_process for a process. This has been observed when a process under the debugger executes exec(3). In this scenario: - The process executes exec. - This will eventually release the process's mm, which will cause the kfd_process object associated

[linux-next:master] BUILD REGRESSION bb7a2467e6beef44a80a17d45ebf2931e7631083

2024-04-26 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master branch HEAD: bb7a2467e6beef44a80a17d45ebf2931e7631083 Add linux-next specific files for 20240426 Error/Warning reports: https://lore.kernel.org/oe-kbuild-all/202404262217.dt4hoodh-...@intel.com Error

Re: [PATCH] drm/amdkfd: Flush the process wq before creating a kfd_process

2024-04-26 Thread Felix Kuehling
On 2024-04-26 14:55, Lancelot SIX wrote: There is a race condition when re-creating a kfd_process for a process. This has been observed when a process under the debugger executes exec(3). In this scenario: - The process executes exec. - This will eventually release the process's mm, which wi

[pull] amdgpu, amdkfd drm-next-6.10

2024-04-26 Thread Alex Deucher
Hi Dave, Sima, More new stuff for 6.10. The following changes since commit 0208ca55aa9c9b997da1f5bc45c4e98916323f08: Backmerge tag 'v6.9-rc5' into drm-next (2024-04-22 14:35:52 +1000) are available in the Git repository at: https://gitlab.freedesktop.org/agd5f/linux.git tags/amd-drm-next-

Re: [PATCH 3/3] drm/amdgpu: Fix pinned GART area accounting and fdinfo reporting

2024-04-26 Thread Felix Kuehling
On 2024-04-26 12:43, Tvrtko Ursulin wrote: From: Tvrtko Ursulin When commit b453e42a6e8b ("drm/amdgpu: Add new placement for preemptible SG BOs") added a new TTM region it failed to notice the conceptual imbalance in GART pin size accounting as done in amdgpu_bo_pin/unpin. That imbalance lea

Re: [PATCH 2/2] drm/amdkfd: Allow memory oversubscription on small APUs

2024-04-26 Thread Felix Kuehling
On 2024-04-26 04:37, Lang Yu wrote: The default ttm_tt_pages_limit is 1/2 of system memory. It is prone to out of memory with such a configuration. Indiscriminately allowing the violation of all memory limits is not a good solution. It will lead to poor performance once you actually reach ttm_p

Re: [PATCH 1/2] drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs

2024-04-26 Thread Felix Kuehling
On 2024-04-26 04:37, Lang Yu wrote: Small APUs (i.e., consumer and embedded products) usually have a small carveout of device memory, which can't satisfy the memory allocation requirements of most compute workloads. We can't even run a Basic MNIST Example with a default 512MB carveout. https://github.com/pyt