> > On 14.07.25 07:18, Dave Airlie wrote:
> > > From: Dave Airlie <airl...@redhat.com>
> > >
> > > This enables all the backend code to use the list lru in memcg mode,
> > > and sets the shrinker to be memcg aware.
> > >
> > > It adds the loop case for when pooled pages end up being reparented
> > > to a higher memcg group, so that a newer memcg can search for them
> > > there and take them back.
> > >
> > > Signed-off-by: Dave Airlie <airl...@redhat.com>
> > >
> > > ---
> > > v2: just use the proper stats.
> > > ---
> > >  drivers/gpu/drm/ttm/ttm_pool.c | 126 ++++++++++++++++++++++++++-------
> > >  1 file changed, 102 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > > index a4f4e27c6a1f..1e6da2cc1f06 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > > @@ -142,7 +142,9 @@ static int ttm_pool_nid(struct ttm_pool *pool) {
> > >  }
> > >
> > >  /* Allocate pages of size 1 << order with the given gfp_flags */
> > > -static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > > +static struct page *ttm_pool_alloc_page(struct ttm_pool *pool,
> > > +					struct obj_cgroup *objcg,
> > > +					gfp_t gfp_flags,
> > >  					unsigned int order)
> > >  {
> > >  	unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> > > @@ -162,7 +164,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > >  	p = alloc_pages_node(pool->nid, gfp_flags, order);
> > >  	if (p) {
> > >  		p->private = order;
> > > -		mod_node_page_state(NODE_DATA(page_to_nid(p)), NR_GPU_ACTIVE, (1 << order));
> > > +		if (!mem_cgroup_charge_gpu_page(objcg, p, order, gfp_flags, false)) {
> >
> > Thinking more about it, that is way too late. At this point we can't
> > fail the allocation any more.
>
> I've tested it, and it at least works, but there is a bit of a problem
> with it: if we fail an order-10 allocation, it tries to fall back down
> the order hierarchy, when there is no point, since it can't account the
> maximum size anyway.
>
> > Otherwise we either completely break suspend, or don't account system
> > allocations correctly any more after resume.
>
> When you say suspend here, do you mean for VRAM allocations? Normal
> system RAM allocations, which are what is accounted here, shouldn't have
> any effect on suspend/resume, since they stay where they are. Currently
> it also doesn't try to account for evictions at all.
I've just traced the global swapin/out paths as well, and those seem fine
for memcg at this point, since they are only called after
populate/unpopulate. I haven't addressed the new xe swap paths, because I
don't have a way to test them, since amdgpu doesn't support those; I was
thinking I'd leave it on the list for when amdgpu moves to that path, or
I can spend some time on xe.

Dave.

> > What we need is to reserve the memory on BO allocation and commit it
> > when the TT backend is populated.
>
> I'm not sure what reserve vs commit is here; a mem cgroup is really
> just "reserve until you can reserve no more", it's just a single
> charge/uncharge stage. If we try to charge and we are over the limit,
> bad things will happen: either the allocation fails, or we reclaim for
> the cgroup.
>
> Regards,
> Dave.