Hey, I've been tasked to look into this, and I'm going to start from hopeless naivety and see how far I can get. This is an initial attempt to hook TTM system memory allocations into memcg and account for them.
It does the following:

1. Adds a memcg GPU statistic.
2. Adds a TTM memcg pointer for drivers to set on their user object allocation paths.
3. Adds a single path where we account for memory in TTM, on cached, non-pooled, non-DMA allocations. Cached memory allocations used to be pooled, but we dropped that a while back, which makes them the best target to start attacking this from.
4. Only accounts for memory that is allocated directly from a userspace TTM operation (like page faults or validation). It *doesn't* account for memory allocated in eviction paths due to device memory pressure.

This seems to work for me here on my hacked-up test systems at least; I can see the GPU stats moving and they look sane.

Future work:
- Account for pooled non-cached allocations
- Account for pooled DMA allocations (no idea how that looks)
- Figure out if accounting for eviction is possible, and what it might look like

Dave.