On Tue, Feb 12, 2019 at 12:34:09PM -0800, Andrew Morton wrote:
> On Tue, 12 Feb 2019 13:47:24 -0500 Johannes Weiner <han...@cmpxchg.org> wrote:
>
> > On Tue, Feb 12, 2019 at 09:56:45AM -0800, Roman Gushchin wrote:
> > > The patchset contains few changes to the vmalloc code, which are
> > > leading to some performance gains and code simplification.
> > >
> > > Also, it exports a number of pages, used by vmalloc(),
> > > in /proc/meminfo.
> > >
> > > Patch (1) removes some redundancy on __vunmap().
> > > Patch (2) separates memory allocation and data initialization
> > >   in alloc_vmap_area()
> > > Patch (3) adds vmalloc counter to /proc/meminfo.
> > >
> > > v2->v1:
> > >   - rebased on top of current mm tree
> > >   - switch from atomic to percpu vmalloc page counter
> >
> > I don't understand what prompted this change to percpu counters.
> >
> > All writers already write vmap_area_lock and vmap_area_list, so it's
> > not really saving much. The for_each_possible_cpu() for /proc/meminfo
> > on the other hand is troublesome.
>
> percpu_counters would fit here. They have probably-unneeded locking
> but I expect that will be acceptable.
>
> And they address the issues with for_each_possible_cpu() avoidance, CPU
> hotplug and transient negative values.

Using the existing vmap_area_lock (as Johannes suggested) is also
problematic, due to the different life-cycles of vmap_areas and vmalloc
pages. A special flag will be required to decrease the counter during
the lazy deletion of vmap_areas. The allocation path will require
passing a bool flag through too many nested functions. Also, it will be
semi-accurate, which is probably tolerable. So, it's doable, but it
doesn't look nice to me.

So, using a simple per-cpu counter still seems to be the best option.
A transient negative value is a valid concern, but easily fixable.
Are there any others?

What's the problem with for_each_possible_cpu()? Reading /proc/meminfo
is not that hot, no?

Thanks!
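For illustration only, here is a minimal userspace sketch of the "simple per-cpu counter" idea and of one way the transient negative value could be handled. It is not the code from the patchset: the names nr_vmalloc_pages, vmalloc_pages_add(), vmalloc_pages_read() and racy_read_demo(), as well as the fixed NR_CPUS, are hypothetical, and a plain array stands in for a real per-cpu variable. The race is simulated in a single thread by interleaving a reader's walk over the slots with an allocation on one CPU and a free on another.

```c
#define NR_CPUS 4	/* hypothetical fixed CPU count for this sketch */

/* One slot per CPU; in the kernel this would be a real per-cpu variable. */
static long nr_vmalloc_pages[NR_CPUS];

/* A writer only touches the slot of the CPU it runs on: no shared lock. */
static void vmalloc_pages_add(int cpu, long nr)
{
	nr_vmalloc_pages[cpu] += nr;
}

/* Reader: walk all slots (stand-in for for_each_possible_cpu()). */
static long vmalloc_pages_sum_raw(void)
{
	long sum = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		sum += nr_vmalloc_pages[cpu];
	return sum;
}

/* Same walk, but clamp the result so a reader never sees a negative sum. */
static long vmalloc_pages_read(void)
{
	long sum = vmalloc_pages_sum_raw();

	return sum < 0 ? 0 : sum;
}

/*
 * Simulated race: the reader samples CPU 0, then 8 pages are allocated
 * on CPU 0 and freed on CPU 1 before the reader reaches CPU 1. The
 * reader sees the free but not the matching allocation.
 */
static long racy_read_demo(void)
{
	long sum = nr_vmalloc_pages[0];		/* reader visits CPU 0 first */

	vmalloc_pages_add(0, 8);		/* allocation accounted on CPU 0 */
	vmalloc_pages_add(1, -8);		/* free accounted on CPU 1 */

	for (int cpu = 1; cpu < NR_CPUS; cpu++)
		sum += nr_vmalloc_pages[cpu];
	return sum;				/* unclamped, can be negative */
}
```

Starting from all-zero slots, the interleaved walk in racy_read_demo() ends up with a negative sum even though the true number of pages never dropped below zero. A later full walk through vmalloc_pages_read() is consistent again, and the clamp covers the case where it is not. Whether clamping in the reader is the right fix for the real counter is an assumption of this sketch.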