On Wed, 10 Feb 2016 16:24:16 -0800 Tim Chen <tim.c.c...@linux.intel.com> wrote:

> On Wed, 2016-02-10 at 13:28 -0800, Andrew Morton wrote:
> >
> > If a process is unmapping 4MB then it's pretty crazy for us to be
> > hitting the percpu_counter 32 separate times for that single operation.
> >
> > Is there some way in which we can batch up the modifications within the
> > caller and update the counter less frequently? Perhaps even in a
> > single hit?
>
> I think the problem is the batch size is too small and we overflow
> the local counter into the global counter for 4M allocations.

That's one way of looking at the issue. The other way (which I point
out above) is that we're calling vm_[un]_acct_memory too frequently
when mapping/unmapping 4M segments.

Exactly which mmap.c callsite is causing this issue?
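
To make the two views concrete, here is a rough userspace sketch. It is a toy single-CPU model, not the kernel's percpu_counter code. The names (toy_counter, toy_counter_add, global_hits_for) are made up for illustration, and the numbers are assumptions: 4KB pages, so 4MB is 1024 pages; a batch of 32; and the unmap arriving as 32 calls of 32 pages each.

```c
#include <assert.h>

/* Toy, single-CPU model of a batched counter (hypothetical, for illustration). */
struct toy_counter {
    long global;       /* shared total; lock-protected in a real implementation */
    long local;        /* this CPU's pending, not yet folded delta */
    long batch;        /* fold into the shared total once |local| reaches this */
    long global_hits;  /* how many times the shared total was touched */
};

/* Add amount to the local delta; fold into the shared total on overflow. */
void toy_counter_add(struct toy_counter *c, long amount)
{
    long count = c->local + amount;

    if (count >= c->batch || count <= -c->batch) {
        c->global += count;   /* the expensive, contended path */
        c->local = 0;
        c->global_hits++;
    } else {
        c->local = count;     /* the cheap, CPU-local path */
    }
}

/* Unaccount calls * pages_per_call pages and report shared-total hits. */
long global_hits_for(long calls, long pages_per_call, long batch)
{
    struct toy_counter c = { 0, 0, batch, 0 };
    long i;

    assert(batch > 0);
    for (i = 0; i < calls; i++)
        toy_counter_add(&c, -pages_per_call);  /* unmapping: negative delta */
    return c.global_hits;
}
```

Under those assumptions, global_hits_for(32, 32, 32) touches the shared total on every call, global_hits_for(1, 1024, 32) touches it once (batching in the caller), and global_hits_for(32, 32, 2048) never touches it (a larger batch). Both approaches cut the shared-counter traffic; they differ in where the batching is done.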