On 20.01.2025 17:20, Ilia Evdokimov wrote:
On 15.01.2025 20:16, Sami Imseih wrote:
Probably, but first I suggest benchmarking with sampling applied to
all queries. If the results are good, we can later filter certain
queries based on different characteristics.
Absolutely. The benchmark numbers to justify this feature are
the next step. Thanks for your work on this!
Regards,
Sami
Hi hackers,
I’d like to share the results of my benchmark.
To stress the spinlock in pg_stat_statements while incrementing
counters for popular entries, it's important to use the same query
repeatedly. To avoid overloading pgss with normalization, the queries
should not contain constants. I found that using the query 'SELECT
now()' works best for this purpose.
I ran the benchmark on a machine with 48 CPUs, which may not be
sufficient to fully test sampling, but I was able to achieve around
1.5 million TPS using 'SELECT now()'. To load the CPUs to about
85-90%, I ran pgbench with -c 45 -j 45, using a custom 'SELECT now()'
in pgbench_script.sql file containing 'SELECT now()'. The benchmark
was conducted on a newly created 'pgbench' database, , with processes
like autovacuum, fsync, and checkpoints disabled(e.g., checkpoint
_timeout = '24h').
I tested various 'sample_rate' values and compared the performance
with pgss both enabled and disabled. The detailed results are attached
in the 'sample_rate.txt' file, along with pgbench progress reports
every 10 seconds.
pgbench -c 45 -j 45 -T 200 -f pgbench_script.sql --progress=10 pgbench
sample_rate | num of transactions | lat | stddev | tps
1.0 | 1207127.572580 |0.037| 0.030 | 241400836
0.8 | 1403551.516338 |0.032| 0.031 | 280673286
0.5 | 1658596.614064 |0.027| 0.012 | 331679344
0.2 | 1757747.509476 |0.025| 0.008 | 351507156
0.0 | 1760055.986259 |0.025| 0.008 | 351965559
pgss off | 1828743.791205 |0.024| 0.008 | 365703762
If anyone has the capability to run this benchmark on machines with
more CPUs or with different queries, it would be nice. I’d appreciate
any suggestions or feedback.
--.
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Hi hackers,
I'd like to share the results of a new benchmark I conducted using the
pgbench tool with the -M prepared -S flags!
Previously, I shared a benchmark focused on extreme high-load scenarios,
but now I’ve conducted a more realistic test case based on Alexander’s
suggestion [0].
Machine Characteristics:
* 48 CPUs
* 374 GB RAM
* 3 TB Disk
To avoid interference from background processes, I disabled autovacuum
and synchronous operations using the following settings in
postgresql.auto.conf:
autovacuum = off
fsync = off
checkpoint_timeout = '24h'
shared_preload_libraries = 'pg_stat_statements'
Create a test database:
createdb pgbench
pgbench -i -s 3000 pgbench
If pg_stat_statements is enabled, reset its hash table before starting
the benchmark:
psql -c 'SELECT pg_stat_statements_reset()'
Run the benchmark with the following command:
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
Here are the results with and without pg_stat_statements, varying the
sample_rate parameter:
sample_rate | tps | lat | stddev | num of transactions
1.0 | 1299070.105018 |0.035| 0.055 | 259781895
0.8 | 1393327.873249 |0.033| 0.039 | 278630669
0.5 | 1420303.856480 |0.032| 0.037 | 284023396
0.2 | 1432482.672301 |0.032| 0.037 | 286461834
0.0 | 1760055.986259 |0.032| 0.036 | 290169584
pgss off | 1460920.805259 |0.031| 0.036 | 292144256
When pg_stat_statements is enabled, TPS decreases by about 10%. However,
by leveraging the new sampling feature (sample_rate), it is possible to
significantly mitigate this overhead. For detailed benchmark results
with the '-M prepared -S' flags, please see the attached file
'sample_rate_prepared.txt'
P.S. Previously, I shared results under extreme high-load conditions
using the query 'SELECT now()'. For clarity, I’m rewriting those results
for better readability:
sample_rate | tps | lat | stddev | num of transactions
1.0 | 1207127.572580 |0.037| 0.030 | 241400836
0.8 | 1403551.516338 |0.032| 0.031 | 280673286
0.5 | 1658596.614064 |0.027| 0.012 | 331679344
0.2 | 1757747.509476 |0.025| 0.008 | 351507156
0.0 | 1760055.986259 |0.025| 0.008 | 351965559
pgss off | 1828743.791205 |0.024| 0.008 | 365703762
Here, we observed a significant ~33% decrease in TPS when pgss was
enabled without sampling.
Any thoughts?
[0]:
https://www.postgresql.org/message-id/CAPpHfdsTKAQqC3A48-MGQhrhfEamXZPb64w%3Dutk7thQcOMNr7Q%40mail.gmail.com
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
pg_stat_statements.sample_rate = 1.0
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 314978.6 tps, lat 0.146 ms stddev 0.257, 0 failed
progress: 20.0 s, 849655.3 tps, lat 0.054 ms stddev 0.108, 0 failed
progress: 30.0 s, 1338227.1 tps, lat 0.034 ms stddev 0.045, 0 failed
progress: 40.0 s, 1375800.0 tps, lat 0.033 ms stddev 0.042, 0 failed
progress: 50.0 s, 1383403.2 tps, lat 0.033 ms stddev 0.039, 0 failed
progress: 60.0 s, 1381348.2 tps, lat 0.033 ms stddev 0.044, 0 failed
progress: 70.0 s, 1386079.3 tps, lat 0.033 ms stddev 0.040, 0 failed
progress: 80.0 s, 1372314.9 tps, lat 0.033 ms stddev 0.040, 0 failed
progress: 90.0 s, 1383231.2 tps, lat 0.033 ms stddev 0.042, 0 failed
progress: 100.0 s, 1376702.3 tps, lat 0.033 ms stddev 0.042, 0 failed
progress: 110.0 s, 1375622.4 tps, lat 0.033 ms stddev 0.043, 0 failed
progress: 120.0 s, 1387577.0 tps, lat 0.033 ms stddev 0.040, 0 failed
progress: 130.0 s, 1383540.3 tps, lat 0.033 ms stddev 0.039, 0 failed
progress: 140.0 s, 1383138.6 tps, lat 0.033 ms stddev 0.040, 0 failed
progress: 150.0 s, 1377247.6 tps, lat 0.033 ms stddev 0.042, 0 failed
progress: 160.0 s, 1382830.9 tps, lat 0.033 ms stddev 0.039, 0 failed
progress: 170.0 s, 1383637.7 tps, lat 0.033 ms stddev 0.041, 0 failed
progress: 180.0 s, 1381811.7 tps, lat 0.033 ms stddev 0.041, 0 failed
progress: 190.0 s, 1382384.7 tps, lat 0.033 ms stddev 0.041, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 259781895
number of failed transactions: 0 (0.000%)
latency average = 0.035 ms
latency stddev = 0.055 ms
initial connection time = 26.448 ms
tps = 1299070.105018 (without initial connection time)
pgbench=# select query, calls from pg_stat_statements where calls > 1000;
query | calls
------------------------------------------------------+-----------
SELECT abalance FROM pgbench_accounts WHERE aid = $1 | 259781895
(1 row)
pgbench=# show pg_stat_statements.sample_rate ;
pg_stat_statements.sample_rate
--------------------------------
1
(1 row)
pgbench=# select count(*) from pg_stat_statements;
count
-------
130
(1 row)
========================================================================
pg_stat_statements.sample_rate = 0.8
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 319046.3 tps, lat 0.144 ms stddev 0.257, 0 failed
progress: 20.0 s, 893116.1 tps, lat 0.051 ms stddev 0.102, 0 failed
progress: 30.0 s, 1440539.1 tps, lat 0.032 ms stddev 0.025, 0 failed
progress: 40.0 s, 1484261.8 tps, lat 0.031 ms stddev 0.015, 0 failed
progress: 50.0 s, 1483957.5 tps, lat 0.031 ms stddev 0.018, 0 failed
progress: 60.0 s, 1491030.9 tps, lat 0.031 ms stddev 0.014, 0 failed
progress: 70.0 s, 1483171.0 tps, lat 0.031 ms stddev 0.014, 0 failed
progress: 80.0 s, 1475368.5 tps, lat 0.031 ms stddev 0.015, 0 failed
progress: 90.0 s, 1487719.7 tps, lat 0.031 ms stddev 0.015, 0 failed
progress: 100.0 s, 1485387.0 tps, lat 0.031 ms stddev 0.016, 0 failed
progress: 110.0 s, 1474178.4 tps, lat 0.031 ms stddev 0.020, 0 failed
progress: 120.0 s, 1489178.2 tps, lat 0.031 ms stddev 0.016, 0 failed
progress: 130.0 s, 1480493.7 tps, lat 0.031 ms stddev 0.015, 0 failed
progress: 140.0 s, 1477666.3 tps, lat 0.031 ms stddev 0.014, 0 failed
progress: 150.0 s, 1483067.4 tps, lat 0.031 ms stddev 0.016, 0 failed
progress: 160.0 s, 1480664.3 tps, lat 0.031 ms stddev 0.014, 0 failed
progress: 170.0 s, 1482848.3 tps, lat 0.031 ms stddev 0.019, 0 failed
progress: 180.0 s, 1482780.2 tps, lat 0.031 ms stddev 0.015, 0 failed
progress: 190.0 s, 1487093.0 tps, lat 0.031 ms stddev 0.015, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 278630669
number of failed transactions: 0 (0.000%)
latency average = 0.033 ms
latency stddev = 0.039 ms
initial connection time = 26.004 ms
tps = 1393327.873249 (without initial connection time)
pgbench=# select query, calls from pg_stat_statements where calls > 1000;
query | calls
------------------------------------------------------+-----------
SELECT abalance FROM pgbench_accounts WHERE aid = $1 | 222902226
(1 row)
pgbench=# show pg_stat_statements.sample_rate ;
pg_stat_statements.sample_rate
--------------------------------
0.8
(1 row)
pgbench=# select count(*) from pg_stat_statements;
count
-------
117
(1 row)
========================================================================
pg_stat_statements.sample_rate = 0.5
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 313361.7 tps, lat 0.146 ms stddev 0.263, 0 failed
progress: 20.0 s, 907177.0 tps, lat 0.051 ms stddev 0.103, 0 failed
progress: 30.0 s, 1467741.3 tps, lat 0.031 ms stddev 0.022, 0 failed
progress: 40.0 s, 1514111.4 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 50.0 s, 1508402.2 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 60.0 s, 1511752.6 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 70.0 s, 1505456.0 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 80.0 s, 1502666.4 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 90.0 s, 1512033.1 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 100.0 s, 1503996.7 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 110.0 s, 1506571.7 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 120.0 s, 1523809.6 tps, lat 0.030 ms stddev 0.013, 0 failed
progress: 130.0 s, 1515853.1 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 140.0 s, 1511753.9 tps, lat 0.030 ms stddev 0.010, 0 failed
progress: 150.0 s, 1511626.1 tps, lat 0.030 ms stddev 0.010, 0 failed
progress: 160.0 s, 1513088.5 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 170.0 s, 1520756.9 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 180.0 s, 1519851.7 tps, lat 0.030 ms stddev 0.010, 0 failed
progress: 190.0 s, 1517730.6 tps, lat 0.030 ms stddev 0.009, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 284023396
number of failed transactions: 0 (0.000%)
latency average = 0.032 ms
latency stddev = 0.037 ms
initial connection time = 27.242 ms
tps = 1420303.856480 (without initial connection time)
pgbench=# select query, calls from pg_stat_statements where calls > 1000;
query | calls
------------------------------------------------------+-----------
SELECT abalance FROM pgbench_accounts WHERE aid = $1 | 142014320
(1 row)
pgbench=# show pg_stat_statements.sample_rate ;
pg_stat_statements.sample_rate
--------------------------------
0.5
(1 row)
pgbench=# select count(*) from pg_stat_statements;
count
-------
83
(1 row)
========================================================================
pg_stat_statements.sample_rate = 0.2
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 305735.8 tps, lat 0.150 ms stddev 0.266, 0 failed
progress: 20.0 s, 886211.5 tps, lat 0.052 ms stddev 0.105, 0 failed
progress: 30.0 s, 1477449.3 tps, lat 0.031 ms stddev 0.023, 0 failed
progress: 40.0 s, 1533608.6 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 50.0 s, 1520686.0 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 60.0 s, 1521163.7 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 70.0 s, 1506093.3 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 80.0 s, 1521926.6 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 90.0 s, 1532715.4 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 100.0 s, 1536818.2 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 110.0 s, 1532156.5 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 120.0 s, 1531077.6 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 130.0 s, 1533445.0 tps, lat 0.030 ms stddev 0.011, 0 failed
progress: 140.0 s, 1530465.1 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 150.0 s, 1533264.8 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 160.0 s, 1531706.4 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 170.0 s, 1527768.1 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 180.0 s, 1523762.9 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 190.0 s, 1531264.7 tps, lat 0.030 ms stddev 0.008, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 286461834
number of failed transactions: 0 (0.000%)
latency average = 0.032 ms
latency stddev = 0.037 ms
initial connection time = 26.189 ms
tps = 1432482.672301 (without initial connection time)
pgbench=# select query, calls from pg_stat_statements where calls > 1000;
query | calls
------------------------------------------------------+----------
SELECT abalance FROM pgbench_accounts WHERE aid = $1 | 57294631
(1 row)
pgbench=# show pg_stat_statements.sample_rate ;
pg_stat_statements.sample_rate
--------------------------------
0.2
(1 row)
pgbench=# select count(*) from pg_stat_statements;
count
-------
56
(1 row)
========================================================================
pg_stat_statements.sample_rate = 0.0
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 308612.3 tps, lat 0.149 ms stddev 0.263, 0 failed
progress: 20.0 s, 904494.3 tps, lat 0.051 ms stddev 0.103, 0 failed
progress: 30.0 s, 1497609.8 tps, lat 0.031 ms stddev 0.023, 0 failed
progress: 40.0 s, 1549945.7 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 50.0 s, 1552001.6 tps, lat 0.030 ms stddev 0.009, 0 failed
progress: 60.0 s, 1550998.6 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 70.0 s, 1552726.1 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 80.0 s, 1551847.2 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 90.0 s, 1542375.5 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 100.0 s, 1552666.8 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 110.0 s, 1542779.7 tps, lat 0.030 ms stddev 0.010, 0 failed
progress: 120.0 s, 1533717.7 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 130.0 s, 1532413.9 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 140.0 s, 1544919.6 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 150.0 s, 1539568.5 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 160.0 s, 1551072.9 tps, lat 0.030 ms stddev 0.007, 0 failed
progress: 170.0 s, 1558352.5 tps, lat 0.029 ms stddev 0.009, 0 failed
progress: 180.0 s, 1556477.4 tps, lat 0.029 ms stddev 0.007, 0 failed
progress: 190.0 s, 1549669.6 tps, lat 0.030 ms stddev 0.007, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 290169584
number of failed transactions: 0 (0.000%)
latency average = 0.032 ms
latency stddev = 0.036 ms
initial connection time = 27.090 ms
tps = 1451036.547496 (without initial connection time)
pgbench=# select query, calls from pg_stat_statements where calls > 1000;
query | calls
-------+-------
(0 rows)
pgbench=# show pg_stat_statements.sample_rate ;
pg_stat_statements.sample_rate
--------------------------------
0
(1 row)
pgbench=# select count(*) from pg_stat_statements;
count
-------
0
(1 row)
========================================================================
pg_stat_statements is off
pgbench -c 46 -j 46 -T 200 -M prepared -S --progress=10 pgbench
pgbench (16.6)
starting vacuum...end.
progress: 10.0 s, 316410.4 tps, lat 0.145 ms stddev 0.257, 0 failed
progress: 20.0 s, 919027.2 tps, lat 0.050 ms stddev 0.103, 0 failed
progress: 30.0 s, 1513827.2 tps, lat 0.030 ms stddev 0.021, 0 failed
progress: 40.0 s, 1555707.4 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 50.0 s, 1552820.8 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 60.0 s, 1564985.9 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 70.0 s, 1559507.4 tps, lat 0.029 ms stddev 0.007, 0 failed
progress: 80.0 s, 1561871.2 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 90.0 s, 1563076.5 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 100.0 s, 1562030.4 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 110.0 s, 1555040.3 tps, lat 0.029 ms stddev 0.008, 0 failed
progress: 120.0 s, 1557295.8 tps, lat 0.029 ms stddev 0.008, 0 failed
progress: 130.0 s, 1552556.9 tps, lat 0.029 ms stddev 0.008, 0 failed
progress: 140.0 s, 1547155.3 tps, lat 0.030 ms stddev 0.006, 0 failed
progress: 150.0 s, 1543266.5 tps, lat 0.030 ms stddev 0.008, 0 failed
progress: 160.0 s, 1561086.8 tps, lat 0.029 ms stddev 0.007, 0 failed
progress: 170.0 s, 1558993.9 tps, lat 0.029 ms stddev 0.009, 0 failed
progress: 180.0 s, 1554616.8 tps, lat 0.029 ms stddev 0.006, 0 failed
progress: 190.0 s, 1557972.0 tps, lat 0.029 ms stddev 0.007, 0 failed
transaction type: <builtin: select only>
scaling factor: 3000
query mode: prepared
number of clients: 46
number of threads: 46
maximum number of tries: 1
duration: 200 s
number of transactions actually processed: 292144256
number of failed transactions: 0 (0.000%)
latency average = 0.031 ms
latency stddev = 0.036 ms
initial connection time = 29.085 ms
tps = 1460920.805259 (without initial connection time)