Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection

2025-01-17 Thread Friedrich Vock
On 17.01.25 18:29, Michal Koutný wrote: On Thu, Jan 16, 2025 at 09:20:08AM +0100, Friedrich Vock wrote: These pools are allocated on-demand, so if a cgroup has not made any allocations for a specific device, there will be no pool corresponding to that device's memory. Here I under

Re: [PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection

2025-01-16 Thread Friedrich Vock
Hi, On 14.01.25 16:58, Michal Koutný wrote: On Tue, Jan 14, 2025 at 04:39:12PM +0100, Friedrich Vock wrote: If the current css doesn't contain any pool that is a descendant of the "pool" (i.e. when found_descendant == false), then "pool" will point to some unrelated

[PATCH] cgroup/dmem: Don't clobber pool in dmem_cgroup_calculate_protection

2025-01-14 Thread Friedrich Vock
tion. Fix this by overwriting "pool" only if it actually is a descendant of parent_pool, and setting it to NULL otherwise. Also, skip traversing subtrees if pool == NULL to avoid overwriting parent_pool (and because it's pointless). Fixes: b168ed458 ("kernel/cgroup: Add "dm

Re: [PATCH v2 0/7] kernel/cgroups: Add "dmem" memory accounting cgroup.

2024-12-18 Thread Friedrich Vock
On 17.12.24 18:37, Maarten Lankhorst wrote: Den 2024-12-17 kl. 18:11, skrev Tejun Heo: On Tue, Dec 17, 2024 at 03:28:50PM +0100, Maarten Lankhorst wrote: Now that all patches look good, what is needed to merge the series? Without patch 6/7 as it is a hack for testing. There were some questi

Re: [PATCH v2 0/7] kernel/cgroups: Add "dmem" memory accounting cgroup.

2024-12-08 Thread Friedrich Vock
like a "deviceN" possible there as well, or would device IDs look completely different? Regards, Friedrich I've created an IGT test for min and max, and found the changes from Friedrich Vock sent as feedback were needed. I've integrated those into the first patch. Maarten Lankhorst

Re: [PATCH 1/7] kernel/cgroup: Add "dev" memory accounting cgroup

2024-11-14 Thread Friedrich Vock
On 11.11.24 23:53, Maarten Lankhorst wrote: Den 2024-10-28 kl. 15:53, skrev Friedrich Vock: On 23.10.24 09:52, Maarten Lankhorst wrote: The initial version was based roughly on the rdma and misc cgroup controllers, with a lot of the accounting code borrowed from rdma. The current version is

Re: [PATCH 2/3] dma-buf: sort fences in dma_fence_unwrap_merge

2024-10-30 Thread Friedrich Vock
On 24.10.24 14:41, Christian König wrote: The merge function initially handled only individual fences and arrays which in turn were created by the merge function. This allowed to create the new array by a simple merge sort based on the fence context number. The problem is now that since the addi

Re: [PATCH 1/7] kernel/cgroup: Add "dev" memory accounting cgroup

2024-10-28 Thread Friedrich Vock
ev to cgroup.subtree_control, and then you can partition memory. Co-developed-by: Friedrich Vock Signed-off-by: Friedrich Vock Co-developed-by: Maxime Ripard Signed-off-by: Maxime Ripard Signed-off-by: Maarten Lankhorst --- Documentation/admin-guide/cgroup-v2.rst | 51 ++ Documentation/cor

Re: [PATCH 1/3] dma-buf/dma-fence_array: use kvzalloc

2024-10-24 Thread Friedrich Vock
On 24.10.24 22:29, Matthew Brost wrote: On Thu, Oct 24, 2024 at 02:41:57PM +0200, Christian König wrote: Reports indicate that some userspace applications try to merge more than 80k of fences into a single dma_fence_array leading to a warning from Really, yikes. Not really IME. Unless Chris

Re: [PATCH] dma-buf: Eliminate all duplicate fences in dma_fence_unwrap_merge

2024-10-18 Thread Friedrich Vock
Hi, On 18.10.24 10:56, Christian König wrote: Am 18.10.24 um 07:46 schrieb Friedrich Vock: When dma_fence_unwrap_merge is called on fence chains where the fences aren't ordered by context, the merging logic breaks down and we end up inserting fences twice. Doing this repeatedly leads t

[PATCH] dma-buf: Eliminate all duplicate fences in dma_fence_unwrap_merge

2024-10-17 Thread Friedrich Vock
es to merge. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3617 Signed-off-by: Friedrich Vock --- drivers/dma-buf/dma-fence-unwrap.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/drivers/dma-buf/dma-fence-unwrap.c b/drivers/dma-buf/dma-fence-un

[PATCH v3 1/3] drm/amdgpu: Don't implicit sync PRT maps.

2024-08-19 Thread Friedrich Vock
From: Tatsuyuki Ishi These are considered map operations rather than unmap, and there is no point of doing implicit synchronization here. Signed-off-by: Tatsuyuki Ishi --- drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm

[PATCH v3 0/3] drm/amdgpu: Explicit sync for GEM VA operations

2024-08-19 Thread Friedrich Vock
In Vulkan, it is the application's responsibility to perform adequate synchronization before a sparse unmap, replace or BO destroy operation. This adds an option to AMDGPU_VA_OPs to disable redundant implicit sync that happens on sparse unmap or replace operations. This has seen a significant impr

[PATCH v3 3/3] drm/amdgpu: Bump amdgpu driver version.

2024-08-19 Thread Friedrich Vock
From: Tatsuyuki Ishi For detection of the new explicit sync functionality without having to try the ioctl. Signed-off-by: Tatsuyuki Ishi --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/

[PATCH v3 2/3] drm/amdgpu: Add optional explicit sync fences for GEM operations.

2024-08-19 Thread Friedrich Vock
DRM handle even between different kind of drivers (radeonsi vs radv). Signed-off-by: Tatsuyuki Ishi Co-developed-by: Friedrich Vock Signed-off-by: Friedrich Vock --- .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_csa.c | 2 +- drivers/gpu/drm/

Re: [PATCH 9/9] accel/rocket: Add IOCTLs for synchronizing memory accesses

2024-06-12 Thread Friedrich Vock
On 12.06.24 15:53, Tomeu Vizoso wrote: The NPU cores have their own access to the memory bus, and this isn't cache coherent with the CPUs. Add IOCTLs so userspace can mark when the caches need to be flushed, and also when a writer job needs to be waited for before the buffer can be accessed from

Re: [RFC 1/5] drm/amdgpu: Fix migration rate limiting accounting

2024-05-13 Thread Friedrich Vock
On 09.05.24 11:19, Tvrtko Ursulin wrote: On 08/05/2024 20:08, Friedrich Vock wrote: On 08.05.24 20:09, Tvrtko Ursulin wrote: From: Tvrtko Ursulin The logic assumed any migration attempt worked and therefore would over- account the amount of data migrated during buffer re-validation. As a

Re: [RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-05-13 Thread Friedrich Vock
Hi, On 02.05.24 16:23, Maarten Lankhorst wrote: Hey, [snip] For Xe, I've been looking at using cgroups. A small prototype is available at https://cgit.freedesktop.org/~mlankhorst/linux/log/?h=dumpcg To stimulate discussion, I've added amdgpu support as well. This should make it possible to is

Re: [RFC 1/5] drm/amdgpu: Fix migration rate limiting accounting

2024-05-08 Thread Friedrich Vock
Signed-off-by: Tvrtko Ursulin Cc: Christian König Cc: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 26 +- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c ind

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Friedrich Vock
On 25.04.24 09:15, Christian König wrote: Am 25.04.24 um 09:06 schrieb Friedrich Vock: On 25.04.24 08:58, Christian König wrote: Am 25.04.24 um 08:46 schrieb Friedrich Vock: On 25.04.24 08:32, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust

Re: [RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-25 Thread Friedrich Vock
On 25.04.24 08:25, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: This adds GTT to the "preferred domains" of this buffer object, which will also prevent any attempts at moving the buffer back to VRAM if there is space. If VRAM is full, GTT will already be c

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-25 Thread Friedrich Vock
On 25.04.24 08:58, Christian König wrote: Am 25.04.24 um 08:46 schrieb Friedrich Vock: On 25.04.24 08:32, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust buffer priorities in response to changes in application demand and memory pressure. Yeah

Re: [RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-24 Thread Friedrich Vock
On 25.04.24 08:32, Christian König wrote: Am 24.04.24 um 18:57 schrieb Friedrich Vock: Used by userspace to adjust buffer priorities in response to changes in application demand and memory pressure. Yeah, that was discussed over and over again. One big design criteria is that we can't

[RFC PATCH 17/18] drm/amdgpu: Implement EVICTED_VRAM query

2024-04-24 Thread Friedrich Vock
Used by userspace to gauge the severity of memory overcommit and make prioritization decisions based on it. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c | 3

[RFC PATCH 16/18] drm/amdgpu: Implement SET_PRIORITY GEM op

2024-04-24 Thread Friedrich Vock
Used by userspace to adjust buffer priorities in response to changes in application demand and memory pressure. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 20 include/uapi/drm/amdgpu_drm.h | 1 + 2 files changed, 21 insertions

[RFC PATCH 11/18] drm/ttm: Bump BO priority count

2024-04-24 Thread Friedrich Vock
For adjustable priorities by userspace, it is nice to have a bit more granularity. Signed-off-by: Friedrich Vock --- include/drm/ttm/ttm_resource.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/drm/ttm/ttm_resource.h b/include/drm/ttm/ttm_resource.h index

[RFC PATCH 10/18] drm/amdgpu: Don't add GTT to initial domains after failing to allocate VRAM

2024-04-24 Thread Friedrich Vock
This adds GTT to the "preferred domains" of this buffer object, which will also prevent any attempts at moving the buffer back to VRAM if there is space. If VRAM is full, GTT will already be chosen as a fallback. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_ge

[RFC PATCH 15/18] drm/amdgpu: Set a default priority for user/kernel BOs

2024-04-24 Thread Friedrich Vock
Reserve the highest priority for the kernel, and choose a balanced value as userspace default. Userspace is intended to be able to modify these later to mark buffers as important/unimportant. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c| 1 + drivers/gpu/drm/amd

[RFC PATCH 07/18] drm/amdgpu: Add TTM uneviction control functions

2024-04-24 Thread Friedrich Vock
Try unevicting only VRAM/GTT BOs. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 50 + 1 file changed, 50 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c index 64f5001a7dc5d

[RFC PATCH 00/18] TTM interface for managing VRAM oversubscription

2024-04-24 Thread Friedrich Vock
objections/comments/questions about my proposed design? Thanks, Friedrich [1] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6833 [2] https://gitlab.freedesktop.org/pixelcluster/mesa/-/tree/spilling Friedrich Vock (18): drm/ttm: Add tracking for evicted memory drm/ttm: Add per-BO

[RFC PATCH 01/18] drm/ttm: Add tracking for evicted memory

2024-04-24 Thread Friedrich Vock
These utilities will be used to keep track of what buffers have been evicted from any particular place, to try and decide when to try undoing the eviction. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_device.c | 1 + drivers/gpu/drm/ttm/ttm_resource.c | 14

[RFC PATCH 14/18] drm/ttm: Consider BOs placed in non-favorite locations evicted

2024-04-24 Thread Friedrich Vock
If we didn't get the favorite placement because it was full, we should try moving it into the favorite placement once there is space. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 28 +++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --

[RFC PATCH 18/18] drm/amdgpu: Bump minor version

2024-04-24 Thread Friedrich Vock
Indicates support for EVICTED_VRAM queries and AMDGPU_GEM_OP_SET_PRIORITY Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu

[RFC PATCH 03/18] drm/ttm: Implement BO eviction tracking

2024-04-24 Thread Friedrich Vock
For each buffer object, remember evictions and try undoing them if memory pressure gets lower again. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 28 +++- drivers/gpu/drm/ttm/ttm_bo_util.c | 3 +++ 2 files changed, 30 insertions(+), 1 deletion

[RFC PATCH 13/18] drm/ttm: Implement ttm_bo_update_priority

2024-04-24 Thread Friedrich Vock
Used to dynamically adjust priorities of buffers at runtime, to react to changes in memory pressure/usage patterns. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 17 + include/drm/ttm/ttm_bo.h | 2 ++ 2 files changed, 19 insertions(+) diff --git a

[RFC PATCH 12/18] drm/ttm: Do not evict BOs with higher priority

2024-04-24 Thread Friedrich Vock
This makes buffer eviction significantly more stable by avoiding ping-ponging caused by low-priority buffers evicting high-priority buffers and vice versa. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 9 +++-- drivers/gpu/drm/ttm/ttm_resource.c | 5 +++-- include

[RFC PATCH 06/18] drm/ttm: Add public buffer eviction/uneviction functions

2024-04-24 Thread Friedrich Vock
For now, they are only used internally inside TTM, but this will change with the introduction of dynamic buffer priorities. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 168 ++- include/drm/ttm/ttm_bo.h | 6 ++ 2 files changed, 172

[RFC PATCH 02/18] drm/ttm: Add per-BO eviction tracking

2024-04-24 Thread Friedrich Vock
Make each buffer object aware of whether it has been evicted or not. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 1 + include/drm/ttm/ttm_bo.h | 11 +++ 2 files changed, 12 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c

[RFC PATCH 05/18] drm/ttm: Add option to evict no BOs in operation

2024-04-24 Thread Friedrich Vock
When undoing evictions because of decreased memory pressure, it makes no sense to try evicting other buffers. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/ttm/ttm_bo.c | 2 ++ include/drm/ttm/ttm_bo.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b

[RFC PATCH 09/18] drm/amdgpu: Don't mark VRAM as a busy placement for VRAM|GTT resources

2024-04-24 Thread Friedrich Vock
We will never try evicting things from VRAM for these resources anyway. This affects TTM buffer uneviction logic, which would otherwise try to move these buffers into VRAM (clashing with VRAM-only allocations). Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu_object.c | 13

[RFC PATCH 08/18] drm/amdgpu: Don't try moving BOs to preferred domain before submit

2024-04-24 Thread Friedrich Vock
TTM now takes care of moving buffers to the best possible domain. Signed-off-by: Friedrich Vock --- drivers/gpu/drm/amd/amdgpu/amdgpu.h| 2 - drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 191 + drivers/gpu/drm/amd/amdgpu/amdgpu_cs.h | 4 - drivers/gpu/drm/amd

[RFC PATCH 04/18] drm/ttm: Add driver funcs for uneviction control

2024-04-24 Thread Friedrich Vock
Provides fine-grained control for drivers over which buffers should be considered when attempting to undo evictions. Signed-off-by: Friedrich Vock --- include/drm/ttm/ttm_device.h | 23 +++ 1 file changed, 23 insertions(+) diff --git a/include/drm/ttm/ttm_device.h b/include