On Tue, 27 May 2025 at 08:08, Dave Chinner <da...@fromorbit.com> wrote:
>
> On Tue, May 27, 2025 at 06:32:30AM +1000, Dave Airlie wrote:
> > Hey all,
> >
> > Hope someone here can help me work this out, I've been studying
> > list_lru a bit this week for possible use in the GPU driver memory
> > pool code.
> >
> > I understand that when a cgroup goes away, its LRU resources get
> > reparented into the parent resource, however I'm wondering about
> > operation in the opposite direction and whether this is possible or
> > something we'd like to add.
>
> It's possible, but you need to write the code yourself.
>
> You might want to look at the zswap code, it has a memcg-aware
> global object LRU that charges individual entries to the memcg that
> use space in the pool.
>
> > Scenario:
> > 1. Toplevel cgroup - empty LRU
> > 2. Child cgroup A created, adds a bunch of special pages to the LRU
> > 3. Child cgroup A dies, pages in lru list get reparented to toplevel cgroup
> > 4. Child cgroup B created. Now if B wants to get special pages from
> > the pool, is it possible for B to get access to the LRU from the
> > toplevel cgroup automatically?
> >
> > Ideally B would take pages from the
> > parent LRU, put them back into its own LRU, then reuse the ones
> > from its LRU, and only allocate new special pages once it has
> > none and the parent cgroup has none as well.
>
> The list_lru has nothing to do with what context gets a new
> reference to the objects on the LRU. This is something that your
> pool object lookup/allocation interface would do.
>
> If your lookup interface is cgroup aware, it can look up the parent,
> search its pool and dequeue from the LRU via:
>
>         parent_memcg = parent_mem_cgroup(child_memcg);
>         <lookup object>
>         list_lru_del(<object> ..., parent_memcg);
>
> When the child is done with it, it can add it back to
> its own LRU via:
>
>         list_lru_add(...., child_memcg).

Thanks Dave,

So this seems like something that would need to recurse up to the root
cgroup, which makes me wonder if generic code could/should provide it.

list_lru_walk_node already does a bit of policy here, where it walks
the non-memcg lru, then walks the per-memcg ones.

I kinda need that but in reverse, where it walks the memcg, then its
ancestors, then the non-memcg lru. Just wondering whether that makes
sense as common code, the way list_lru_walk_node is?

>
> > I'm just not seeing where the code for 4 happens, but I'm not fully
> > across this all yet either,
>
> You won't find it, because it doesn't do 4) at all - that's consumer
> side functionality, not generic functionality. If you want to have a
> pool that is owned by a parent memcg and charge/track it to a child
> memcg on allocation, then you need to write the pool management code
> that performs this management. The APIs are there to build this sort
> of thing, but it's not generic functionality the list_lru provides.

I have the pool bits, I just wasn't sure how generic the code should
be that traverses the memcg lrus from the child to the root to see if
any level has some pages in its lru. I can write it in the consumer,
but I do think it's quite like list_lru_walk_node, just with a
different allocation strategy.
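
To make the lookup order concrete, here is a rough userspace toy model
of the child-to-root walk (plain C, not kernel code). Every toy_* name
is made up for illustration and none of them is a list_lru or memcg
API. The non-memcg lru is modelled simply as the last parent in the
chain, and locking is left out entirely.

```c
/* Userspace toy model only: the toy_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_page {
	struct toy_page *next;
};

/* Stand-in for one LRU level: a memcg's list or the non-memcg list. */
struct toy_lru {
	struct toy_lru *parent;		/* NULL at the last level to search */
	struct toy_page *head;
	size_t nr_items;
};

/* Push a page onto the front of one level's list. */
static void toy_lru_add(struct toy_lru *lru, struct toy_page *page)
{
	page->next = lru->head;
	lru->head = page;
	lru->nr_items++;
}

/* Pop a page from one level's list, or return NULL if it is empty. */
static struct toy_page *toy_lru_pop(struct toy_lru *lru)
{
	struct toy_page *page = lru->head;

	if (!page)
		return NULL;
	lru->head = page->next;
	page->next = NULL;
	lru->nr_items--;
	return page;
}

/*
 * Walk from the child towards the root and take a page from the first
 * level that has one. Only allocate a new page if every level is empty.
 */
static struct toy_page *toy_pool_get(struct toy_lru *child)
{
	struct toy_lru *lru;

	assert(child != NULL);
	for (lru = child; lru; lru = lru->parent) {
		struct toy_page *page = toy_lru_pop(lru);

		if (page)
			return page;
	}
	return calloc(1, sizeof(struct toy_page));
}

/* Pages always go back to the LRU of the cgroup that was using them. */
static void toy_pool_put(struct toy_lru *owner, struct toy_page *page)
{
	toy_lru_add(owner, page);
}
```

In the scenario above, B's first toy_pool_get() would pull a page that
was reparented to the toplevel, toy_pool_put() would then park it on
B's own list, and later gets would be satisfied from B directly. The
real thing would need per-level locking and some notion of the
non-memcg lru as the final fallback.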

Dave.
