> > On 14.07.25 07:18, Dave Airlie wrote:
> > > From: Dave Airlie <airl...@redhat.com>
> > >
> > > This enables all the backend code to use the list lru in memcg mode,
> > > and set the shrinker to be memcg aware.
> > >
> > > It adds the loop case for when pooled pages end up being reparented
> > > to a higher memcg group, so that a newer memcg can search for them
> > > there and take them back.
> > >
> > > Signed-off-by: Dave Airlie <airl...@redhat.com>
> > >
> > > ---
> > > v2: just use the proper stats.
> > > ---
> > >  drivers/gpu/drm/ttm/ttm_pool.c | 126 ++++++++++++++++++++++++++-------
> > >  1 file changed, 102 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/ttm/ttm_pool.c b/drivers/gpu/drm/ttm/ttm_pool.c
> > > index a4f4e27c6a1f..1e6da2cc1f06 100644
> > > --- a/drivers/gpu/drm/ttm/ttm_pool.c
> > > +++ b/drivers/gpu/drm/ttm/ttm_pool.c
> > > @@ -142,7 +142,9 @@ static int ttm_pool_nid(struct ttm_pool *pool) {
> > >  }
> > >
> > >  /* Allocate pages of size 1 << order with the given gfp_flags */
> > > -static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > > +static struct page *ttm_pool_alloc_page(struct ttm_pool *pool,
> > > +                                     struct obj_cgroup *objcg,
> > > +                                     gfp_t gfp_flags,
> > >                                       unsigned int order)
> > >  {
> > >       unsigned long attr = DMA_ATTR_FORCE_CONTIGUOUS;
> > > @@ -162,7 +164,10 @@ static struct page *ttm_pool_alloc_page(struct ttm_pool *pool, gfp_t gfp_flags,
> > >               p = alloc_pages_node(pool->nid, gfp_flags, order);
> > >               if (p) {
> > >                       p->private = order;
> > > -                     mod_node_page_state(NODE_DATA(page_to_nid(p)), NR_GPU_ACTIVE, (1 << order));
> > > +                     if (!mem_cgroup_charge_gpu_page(objcg, p, order, gfp_flags, false)) {
> >
> > Thinking more about it, that is way too late. At this point we can't
> > fail the allocation any more.
> >
>
> I've tested that it at least works, but there is a bit of a problem
> with it: if we fail an order-10 allocation, it tries to fall back down
> the order hierarchy, when there is no point, since it couldn't account
> the maximum size in the first place.
>
> > Otherwise we either completely break suspend or don't account system
> > allocations to the correct cgroup any more after resume.
>
> When you say suspend here, do you mean for VRAM allocations? Normal
> system RAM allocations, which are what gets accounted here, shouldn't
> have any effect on suspend/resume since they stay where they are.
> Currently it also doesn't try to account for evictions at all.

I've just traced the global swapin/out paths as well, and those seem
fine for memcg at this point, since they are called only after
populate/unpopulate. I haven't addressed the new xe swap paths yet,
because I don't have a way to test them, since amdgpu doesn't support
those. I was thinking I'd leave that on the list for when amdgpu moves
to that path, or I can spend some time on xe.

Dave.

> >
> > What we need is to reserve the memory on BO allocation and commit it when 
> > the TT backend is populated.
>
> I'm not sure what reserve vs commit means here; memcg is really just
> "reserve until you can reserve no more", a single charge/uncharge
> stage. If we try to charge and we are over the limit, bad things will
> happen: either the allocation fails or reclaim kicks in for the
> cgroup.
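>
> If reserve vs commit means charging the worst-case BO size at creation
> time and then only attributing the actual pages at populate time, a
> minimal sketch could look like this (bo->objcg and all three memcg
> helpers are hypothetical placeholders, just to pin down the two
> phases):
>
>     /* Phase 1: at BO creation, charge the worst-case size so a
>      * limit failure happens before any pages are allocated. */
>     static int ttm_bo_memcg_reserve(struct ttm_buffer_object *bo)
>     {
>             unsigned long nr = bo->base.size >> PAGE_SHIFT;
>
>             if (!mem_cgroup_try_charge_gpu(bo->objcg, nr, GFP_KERNEL))
>                     return -ENOMEM;
>             return 0;
>     }
>
>     /* Phase 2: when the TT backend is populated, attribute the pages
>      * to the earlier reservation; this step cannot fail. */
>     static void ttm_bo_memcg_commit(struct ttm_buffer_object *bo,
>                                     struct page *p, unsigned int order)
>     {
>             mem_cgroup_commit_gpu_page(bo->objcg, p, order);
>     }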
>
> Regards,
> Dave.
