On 09/13/2016 04:35 PM, Luck, Tony wrote: > On Tue, Sep 13, 2016 at 04:13:04PM -0700, Dave Hansen wrote: >> Yikes, is this a new global lock and possible atomic_inc() on a shared >> variable in the fork() path? Has there been any performance or >> scalability testing done on this code? >> >> That mutex could be a disaster for fork() once the filesystem is >> mounted. Even if it goes away, if you have a large number of processes >> in an rdtgroup and they are forking a lot, you're bound to see the >> rdtgrp->refcount get bounced around a lot. > > The mutex is (almost certainly) going away.
Oh, cool. That's good to know. > The atomic_inc() > is likely staying (but only applies to tasks that are in > resource groups other than the default one. But on a system > where we partition the cache between containers/VMs, that may > essentially be all processes. Yeah, that's what worries me. We had/have quite a few regressions from when something runs inside vs. outside of certain cgroups. We definitely don't want to be adding more of those. > We only really use the refcount to decide whether the group > can be removed ... since that is the rare operation, perhaps > we could put all the work there and have it count them with: > > n = 0; > rcu_read_lock(); > for_each_process(p) > if (p->rdtgroup == this_rdtgroup) > n++; > rcu_read_unlock(); > if (n != 0) > return -EBUSY; Yeah, that seems sane. I'm sure it can be optimized even more than that, but that at least gets it out of the fast path.