On Tue, Jul 15, 2025 at 5:34 PM Christian König <christian.koe...@amd.com> wrote:
>
>
>
> On 14.07.25 07:18, Dave Airlie wrote:
> > From: Dave Airlie <airl...@redhat.com>
> >
> > This enables all the backend code to use the list lru in memcg mode,
> > and set the shrinker to be memcg aware.
> >
> > It adds the loop case for when pooled pages end up being reparented
> > to a higher memcg group, that newer memcg can search for them there
> > and take them back.
> >
> > Signed-off-by: Dave Airlie <airl...@redhat.com>
> >
> > ---
> > v2: just use the proper stats.
> > ---
> >  drivers/gpu/drm/ttm/ttm_pool.c | 126 ++++++++++++++++++++++++++-------
> >  1 file changed, 102 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > index a4f4e27c6a1f..1e6da2cc1f06 100644
> > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > @@ -142,7 +142,9 @@ static int ttm_pool_nid(struct ttm_pool *pool) {
> >  }
> >
> >  /* Allocate pages of size 1 << order with the given gfp_flags */
> > -static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > +static struct page *ttm_pool_alloc_page(struct ttm_pool *pool,
> > +                                        struct obj_cgroup *objcg,
> > +                                        gfp_t gfp_flags,
> >                                          unsigned int order)
> >  {
> >          unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> > @@ -162,7 +164,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> >                  p = alloc_pages_node(pool->nid, gfp_flags, order);
> >                  if (p) {
> >                          p->private = order;
> > -                        mod_node_page_state(NODE_DATA(page_to_nid(p)), NR_GPU_ACTIVE, (1 << order));
> > +                        if (!mem_cgroup_charge_gpu_page(objcg, p, order, gfp_flags, false)) {
>
> Thinking more about it that is way to late. At this point we can't fail the
> allocation any more.
>

I've tested that it at least works, but there is a bit of a problem
with it: if we fail an order-10 allocation, it tries to fall back down
the order hierarchy, when there is no point since it can't account the
maximum size.

> Otherwise we either completely break suspend or don't account system
> allocations to the correctly any more after resume.

When you say suspend here, do you mean for VRAM allocations? Normal
system RAM allocations, which are what is accounted here, shouldn't
have any effect on suspend/resume since they stay where they are.
Currently it also doesn't try to account for evictions at all.

>
> What we need is to reserve the memory on BO allocation and commit it when the
> TT backend is populated.

I'm not sure what reserve vs commit means here. The mem cgroup is
really just "reserve until you can reserve no more"; it's a single
charge/uncharge stage. If we try to charge while we are over the
limit, bad things will happen: either the allocation fails or the
cgroup goes into reclaim.

Regards,
Dave.
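
For illustration only, here is a minimal userspace sketch of the fallback
concern described above. It is not the TTM or memcg code: toy_cgroup,
toy_charge, toy_uncharge, toy_populate and TOY_MAX_ORDER are hypothetical
names, and the plain page counter only stands in for a memcg with a hard
limit and a single charge/uncharge stage (no reclaim is modelled).

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_ORDER 10 /* hypothetical: highest order the pool tries */

/* Hypothetical stand-in for a memcg: a page counter with a hard limit. */
struct toy_cgroup {
	unsigned long usage; /* pages currently charged */
	unsigned long limit; /* maximum pages that may be charged */
};

/* Single-stage charge: it either fits under the limit or it fails. */
static bool toy_charge(struct toy_cgroup *cg, unsigned int order)
{
	unsigned long pages = 1UL << order;

	if (cg->usage + pages > cg->limit)
		return false;
	cg->usage += pages;
	return true;
}

static void toy_uncharge(struct toy_cgroup *cg, unsigned long pages)
{
	assert(cg->usage >= pages);
	cg->usage -= pages;
}

/*
 * Back num_pages starting at the highest order, dropping one order each
 * time a charge fails. *attempts counts how many charges were tried.
 * On failure, everything charged so far is uncharged again.
 */
static bool toy_populate(struct toy_cgroup *cg, unsigned long num_pages,
			 unsigned int *attempts)
{
	unsigned long remaining = num_pages;
	unsigned int order = TOY_MAX_ORDER;

	*attempts = 0;
	while (remaining) {
		/* Never use an order larger than what is still needed. */
		while (order > 0 && (1UL << order) > remaining)
			order--;

		(*attempts)++;
		if (toy_charge(cg, order)) {
			remaining -= 1UL << order;
			continue;
		}
		if (order == 0) {
			/* Nothing smaller to try: undo and give up. */
			toy_uncharge(cg, num_pages - remaining);
			return false;
		}
		order--; /* charge failed: fall back to a smaller order */
	}
	return true;
}
```

With a limit of 1000 pages and a request for 1024, the order-10 charge
fails and the loop keeps retrying at smaller orders before finally giving
up, even though the request as a whole could never fit under the limit.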