On 2020/09/11 16:23, bttanakahbk wrote:
Hi,
On 2020-08-19 00:43, Fujii Masao wrote:
Yes, I pushed the document_overhead_by_track_planning.patch, but this
CF entry is for pgss_lwlock_v1.patch which replaces spinlocks with lwlocks
in pg_stat_statements. The latter patch has not been committed yet.
Probably attaching different patches in the same thread caused this
confusion... Anyway, thanks for your comment!
To avoid further confusion, I attached the rebased version of
the patch that was registered at the CF. I'd appreciate it if
you could review this version.
I tested pgss_lwlock_v2.patch with 3 workloads. I couldn't observe any
performance improvement in our environment, and I'm afraid it was even
worse in some cases.
- Workload1: pgbench select-only mode
- Workload2: pgbench custom scripts which run "SELECT 1;"
- Workload3: pgbench custom scripts which run 1000 types of different simple queries
Thanks for running the benchmarks!
- Workload1
First, we set pg_stat_statements.track_planning to on/off and ran the
fully-cached pgbench select-only mode on pg14head, installed on an
on-premises server (32 CPUs, 256GB memory). However, in this environment
we couldn't reproduce the 45% performance drop due to s_lock contention
(which Tharakan-san mentioned in his post at
2895b53b033c47ccb22972b589050...@ex13d05uwc001.ant.amazon.com).
- Workload2
Then we adopted a pgbench custom script running "SELECT 1;", which was
supposed to increase s_lock contention and make it easier to reproduce
the issue. In this case we saw around a 10% performance decrease, along
with a slight increase in s_lock (~10%). With this scenario, despite the
absence of s_lock, the patched build showed more than 50% performance
degradation regardless of track_planning. We also couldn't see any
performance improvement in this workload.
pgbench:
  initialization: pgbench -i -s 100
  benchmarking  : pgbench -j16 -c128 -T180 -r -n -f <script> -h <address> -U <user> -p <port> -d <db>
  # VACUUMed and pg_prewarmed manually before running the benchmark
query: SELECT 1;
 pgss_lwlock_v2.patch | track_planning |   TPS    | decline rate | s_lock | CPU usage
----------------------+----------------+----------+--------------+--------+-------------------------------
 not applied          | OFF            | 810509.4 | standard     | 0.17%  | 98.8% (sys 24.9%, user 73.9%)
 not applied          | ON             | 732823.1 | -9.6%        | 1.94%  | 95.1% (sys 22.8%, user 72.3%)
 applied              | OFF            | 371035.0 | -49.4%       | -      | 65.2% (sys 20.6%, user 44.6%)
 applied              | ON             | 193965.2 | -47.7%       | -      | 41.8% (sys 12.1%, user 29.7%)

 # "-" means that s_lock was not reported by perf.
Ok, so my proposed patch degraded the performance in this case :(
This means that replacing the spinlock with an lwlock in pgss is not the
proper approach for the lock contention issue on pgss...
Upthread, I proposed to split the spinlock for each pgss entry into two
to reduce the lock contention: one for the planner stats and the other
for the executor stats. A rough sketch of that idea is below. Is it worth
working on this approach as an alternative? Or does anyone have any
better idea?
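
Just to illustrate what I mean, here is a rough sketch (not a tested
patch). It assumes the pgssEntry struct in pg_stat_statements.c, and the
new field names plan_mutex/exec_mutex are made up for illustration:

typedef struct pgssEntry
{
	pgssHashKey key;			/* hash key of entry - MUST BE FIRST */
	Counters	counters;		/* the statistics for this query */
	Size		query_offset;	/* query text offset in external file */
	int			query_len;		/* # of valid bytes in query string */
	int			encoding;		/* query text encoding */
	slock_t		plan_mutex;		/* would protect only the planning counters */
	slock_t		exec_mutex;		/* would protect only the execution counters */
} pgssEntry;

/*
 * In pgss_store(), the counter update would then pick the spinlock based
 * on the kind of stats being accumulated (using the existing pgssStoreKind
 * argument), so that pgss_planner() and the executor hooks no longer
 * contend on the same lock.
 */
slock_t    *mutex = (kind == PGSS_PLAN) ? &entry->plan_mutex
										: &entry->exec_mutex;

SpinLockAcquire(mutex);
/* ... update only the counters belonging to this kind ... */
SpinLockRelease(mutex);

One downside is that a reader like the pg_stat_statements() function would
either have to take both spinlocks per entry or accept slightly
inconsistent planning/execution numbers, so I'm not sure yet whether the
extra complexity pays off.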
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION