On Mon, Jul 01, 2013 at 09:16:46AM -0700, Dave Hansen wrote:
> On 06/28/2013 07:20 PM, Zheng Liu wrote:
> >> > IOW, a process needing to do a bunch of MAP_POPULATEs isn't
> >> > parallelizable, but one using this mechanism would be.
> > I look at the code, and it seems that we will handle MAP_POPULATE flag
> > after we release mmap_sem locking in vm_mmap_pgoff():
> >
> >         down_write(&mm->mmap_sem);
> >         ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
> >                             &populate);
> >         up_write(&mm->mmap_sem);
> >         if (populate)
> >                 mm_populate(ret, populate);
> >
> > Am I missing something?
>
> I went and did my same test using mmap(MAP_POPULATE)/munmap() pair
> versus using MADV_POPULATE in 160 threads in parallel.
>
> MADV_POPULATE was about 10x faster in the threaded configuration.
>
> With MADV_POPULATE, the biggest cost is shipping the mmap_sem cacheline
> around so that we can write the reader count update in to it.  With
> mmap(), there is a lot of _contention_ on that lock which is much, much
> more expensive than simply bouncing a cacheline around.

Thanks for your explanation.

FWIW, it would be great if the MAP_POPULATE flag could support shared
mappings, because in our production system there are a lot of
applications that use mmap(2) and then pre-fault the mapping.  Currently
these applications need to pre-fault the mapping manually.

Regards,
                                                - Zheng
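
For concreteness, the manual pre-fault mentioned above looks roughly like
the sketch below. It is only an illustration, not code from any of our
applications: map_shared_prefaulted() is a hypothetical helper name, and
the sketch assumes that a read touch of one byte per page is enough to
pull the page in. Note that it resizes the file to len with ftruncate(),
since touching a shared file mapping past EOF raises SIGBUS.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of `path` with MAP_SHARED and
 * touch one byte per page so that later accesses do not fault.
 * Returns the mapping, or NULL on failure.
 */
char *map_shared_prefaulted(const char *path, size_t len)
{
        int fd = open(path, O_RDWR | O_CREAT, 0600);
        if (fd < 0)
                return NULL;

        /* Resize the file to cover the mapping (this truncates a larger file). */
        if (ftruncate(fd, (off_t)len) != 0) {
                close(fd);
                return NULL;
        }

        char *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);      /* the mapping remains valid after close() */
        if (addr == MAP_FAILED)
                return NULL;

        long page = sysconf(_SC_PAGESIZE);
        if (page <= 0) {
                munmap(addr, len);
                return NULL;
        }

        /* Manual pre-fault: a volatile read forces one access per page. */
        volatile const char *p = addr;
        for (size_t off = 0; off < len; off += (size_t)page)
                (void)p[off];

        return addr;
}
```

The caller unmaps the region with munmap(addr, len) when it is done. The
per-page loop is the part these applications would no longer need to
carry themselves if populate worked for shared mappings.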