Hi,

On 2020-02-17 13:12:37 +0900, Takashi Menjo wrote:
> I applied my patchset that mmap()-s WAL segments as WAL buffers to
> refs/tags/REL_12_0, and measured and analyzed its performance with
> pgbench. Roughly speaking, When I used *SSD and ext4* to store WAL,
> it was "obviously worse" than the original REL_12_0. VTune told me
> that the CPU time of memcpy() called by CopyXLogRecordToWAL() got
> larger than before.

FWIW, this might largely be because of page faults. In contrast to
before we wouldn't reuse the same pages (because they've been
munmap()/mmap()ed), so the first time they're touched, we'll incur page
faults. Did you try mmap()ing with MAP_POPULATE? It's probably also
worthwhile to try to use MAP_HUGETLB.

Still doubtful it's the right direction, but I'd rather have good
numbers to back me up :)

Greetings,

Andres Freund
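
To make the MAP_POPULATE suggestion concrete, here is a minimal
Linux-only sketch of mapping a segment-sized file with the flag set. It
is not taken from the patchset: the helper names, the scratch file under
/tmp and the dummy record are all made up for illustration, and 16MB is
just the default WAL segment size.

```c
#define _GNU_SOURCE             /* exposes MAP_POPULATE on Linux/glibc */
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#define SEG_SIZE ((size_t) 16 * 1024 * 1024)    /* default WAL segment size */

/*
 * Hypothetical helper, not PostgreSQL code: map an already-sized file
 * read/write and ask the kernel to prefault the mapping at mmap() time.
 */
static void *
map_segment_populated(int fd, size_t len)
{
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_SHARED | MAP_POPULATE, fd, 0);
}

/*
 * Create a scratch file, map it, copy a dummy record into the mapping and
 * flush it. Returns 0 on success, -1 on any failure.
 */
static int
demo_populated_write(void)
{
    char        path[] = "/tmp/wal_mmap_demo_XXXXXX";   /* assumed location */
    const char  rec[] = "dummy WAL record";
    char       *seg;
    int         rc = -1;
    int         fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);               /* file goes away once fd is closed */

    if (ftruncate(fd, (off_t) SEG_SIZE) != 0)
        goto out;

    seg = map_segment_populated(fd, SEG_SIZE);
    if (seg == MAP_FAILED)
        goto out;

    /* Stand-in for the memcpy() done by CopyXLogRecordToWAL(). */
    memcpy(seg, rec, sizeof(rec));

    if (msync(seg, SEG_SIZE, MS_SYNC) == 0 &&
        memcmp(seg, rec, sizeof(rec)) == 0)
        rc = 0;

    munmap(seg, SEG_SIZE);
out:
    close(fd);
    return rc;
}
```

Whether this removes the faults showing up under memcpy() would still
have to be measured. MAP_HUGETLB is left out of the sketch on purpose:
as far as I know it only works for anonymous or hugetlbfs-backed
mappings, so it can't simply be added for a file on ext4.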