On Fri, Sep 11, 2020 at 4:04 PM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
>
> On 2020/09/11 16:23, bttanakahbk wrote:
> >
> > pgbench:
> > initialization: pgbench -i -s 100
> > benchmarking : pgbench -j16 -c128 -T180 -r -n -f <script> -h <address>
> >                -U <user> -p <port> -d <db>
> > # VACUUMed and pg_prewarmed manually before running the benchmark
> > query: SELECT 1;
> >
> >> pgss_lwlock_v2.patch  track_planning  TPS       decline rate  s_lock  CPU usage
> >> -                     OFF             810509.4  (baseline)    0.17%   98.8% (sys 24.9%, user 73.9%)
> >> -                     ON              732823.1  -9.6%         1.94%   95.1% (sys 22.8%, user 72.3%)
> >> +                     OFF             371035.0  -49.4%        -       65.2% (sys 20.6%, user 44.6%)
> >> +                     ON              193965.2  -47.7%        -       41.8% (sys 12.1%, user 29.7%)
> >
> > # "-" means that s_lock was not reported by perf.
>
> Ok, so my proposed patch degraded the performance in this case :(
> This means that replacing the spinlock with an lwlock in pgss is not a
> proper approach to the lock contention issue on pgss...
>
> I proposed upthread to split the spinlock for each pgss entry into two
> to reduce the lock contention: one for planner stats, and the other for
> executor stats. Is it worth working on this approach as an alternative
> idea? Or does anyone have a better idea?
For now only calls and [min|max|mean|total]_time are split between
planning and execution, so we'd have to split the rest of the counters
the same way to be able to use two different spinlocks. That would
increase the size of the struct quite a lot, and we'd also have to
change the SRF output, which is already quite wide.
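To make that concrete, here's a rough, untested sketch of what such a
two-spinlock entry could look like. The pgssKindCounters struct and the
stats[]/mutex[] array layout are made up for illustration; pgssHashKey,
slock_t, the PGSS_PLAN/PGSS_EXEC/PGSS_NUMKIND kinds and the SpinLock
calls are the existing ones:

/*
 * Sketch only, not a patch: split the per-entry spinlock in two so
 * that pgss_planner() and the executor hooks update disjoint state.
 * pgssKindCounters and the stats[]/mutex[] arrays are invented here;
 * the field names mirror the existing Counters struct.
 */
typedef struct pgssKindCounters
{
    int64       calls;          /* # of times planned/executed */
    double      total_time;     /* total planning/execution time, msec */
    double      min_time;       /* minimum time, msec */
    double      max_time;       /* maximum time, msec */
    double      mean_time;      /* mean time, msec */
    double      sum_var_time;   /* sum of variances, for stddev */
    int64       rows;           /* total rows retrieved or affected */
    /* ... every remaining buffer/WAL counter, duplicated per kind ... */
} pgssKindCounters;

typedef struct pgssEntry
{
    pgssHashKey key;            /* hash key of entry - MUST BE FIRST */
    pgssKindCounters stats[PGSS_NUMKIND];   /* PGSS_PLAN and PGSS_EXEC */
    slock_t     mutex[PGSS_NUMKIND];        /* one lock per kind */
    Size        query_offset;   /* query text offset in external file */
    int         query_len;      /* # of valid bytes in query string */
    int         encoding;       /* query text encoding */
} pgssEntry;

/* pgss_store() would then only take the lock for the kind it updates: */
SpinLockAcquire(&entry->mutex[kind]);
entry->stats[kind].calls += 1;
entry->stats[kind].total_time += total_time;
/* ... likewise for the other stats[kind] fields ... */
SpinLockRelease(&entry->mutex[kind]);

Since both the planner hook and the executor hooks feed the same shared
counters through pgss_store() today, all of those counters really do
have to exist once per kind for the two locks to be independent, which
is exactly the struct growth and SRF widening mentioned above.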