> It is usually not acceptable to run applications with
> synchronous_commit=off, so once you have identified that the bottleneck
> is in implementing synchronous_commit=on, you probably need to take a
> deep dive into your hardware to figure out why it isn't performing the
> way you need/want/expect it to. Tuning the server under
> synchronous_commit=off when you don't intend to run your production
> server with that setting is unlikely to be fruitful.
I do not intend to run the server with synchronous_commit=off, but based
on my limited knowledge, I'm wondering if all these observations are
somehow related and are caused by the same underlying bottleneck (or
misconfiguration):

1) At higher concurrency levels, TPS for synchronous_commit=off is lower
   with the optimised settings than with the default settings.

2) At ALL concurrency levels, TPS for synchronous_commit=on is lower with
   the optimised settings (irrespective of the shared_buffers value) than
   with the default settings.

3) At higher concurrency levels, optimised + synchronous_commit=on +
   shared_buffers=2G gives HIGHER TPS than optimised +
   synchronous_commit=off + shared_buffers=8G.

Here are the (completely counter-intuitive) numbers for these
observations:

+--------+-----------------------------------------------------------------+------------------------+
|        |                      synchronous_commit=on                      | synchronous_commit=off |
+--------+-----------------------------------------------------------------+------------------------+
| client | Mostly defaults [1] |    Optimised [2]    |    Optimised [2]    |     Optimised [2]      |
|        |                     | + shared_buffers=2G | + shared_buffers=8G | + shared_buffers=8G    |
+--------+---------------------+---------------------+---------------------+------------------------+
|      1 |               80-86 |               74-77 |               75-75 |                169-180 |
+--------+---------------------+---------------------+---------------------+------------------------+
|      6 |             350-376 |             301-304 |             295-300 |              1265-1397 |
+--------+---------------------+---------------------+---------------------+------------------------+
|     12 |             603-619 |             476-488 |             485-493 |              1746-2352 |
+--------+---------------------+---------------------+---------------------+------------------------+
|     24 |            947-1015 |             678-739 |             723-770 |              1869-2518 |
+--------+---------------------+---------------------+---------------------+------------------------+
|     48 |           1435-1512 |            950-1043 |           1029-1086 |              1912-2818 |
+--------+---------------------+---------------------+---------------------+------------------------+
|     96 |           1769-1811 |           3337-3459 |           1302-1346 |              1546-1753 |
+--------+---------------------+---------------------+---------------------+------------------------+
|    192 |           1857-1992 |           3613-3715 |           1269-1345 |              1332-1508 |
+--------+---------------------+---------------------+---------------------+------------------------+
|    384 |           1667-1793 |           3180-3300 |           1262-1364 |              1356-1450 |
+--------+---------------------+---------------------+---------------------+------------------------+

> In case you do intend to run with synchronous_commit=off, or if you are
> just curious: running with a very high number of active connections
> often reveals subtle bottlenecks and interactions, and is very dependent
> on your hardware. Unless you actually intend to run your server with
> synchronous_commit=off and with a large number of active connections, it
> is probably not worth investigating this.

Please see the table above. The reason I'm digging deeper into this is
observation (2) above: I am unable to come up with any optimised settings
that perform better than the default settings at the concurrency levels I
care about (100-150).

> I'm more interested in the low end; you should do much better than those
> reported numbers when clients=1 and synchronous_commit=off with the data
> on SSD. I think you said that pgbench is running on a different machine
> than the database, so perhaps it is just network overhead that is
> keeping this value down. What happens if you run them on the same
> machine?
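For reference, the two setups being compared look roughly like this (the
host name, run duration, scale factor, and database name are placeholders;
the exact invocation isn't shown above):

# initialise the test database once (scale factor is illustrative)
pgbench -i -s 100 pgbench

# run from a separate client machine, over the 1 Gbps network
pgbench -h db-server -c 1 -j 1 -T 300 pgbench

# run on the database server itself, so no network hop is involved
pgbench -c 1 -j 1 -T 300 pgbench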
I'm currently running this, but the early numbers are surprising. For
client=1, the numbers for the optimised settings + shared_buffers=2G are:

-- pgbench run over a 1 Gbps network: 74-77 tps
-- pgbench run on the same machine: 152-153 tps (is this absolute number
   good enough given my hardware?)

Is the 1 Gbps network the bottleneck? Does it explain the three
observations given above? I'll wait for the current set of benchmarks to
finish, then re-run the benchmarks over the network while monitoring
network utilisation (a sketch of that measurement follows the footnotes
below).

[1] "Mostly default" settings are whatever ships with Ubuntu 18.04 + PG
11. A snippet of the relevant settings is given below:

max_connections = 400
work_mem = 4MB
maintenance_work_mem = 64MB
shared_buffers = 128MB
temp_buffers = 8MB
effective_cache_size = 4GB
wal_buffers = -1
wal_sync_method = fsync
max_wal_size = 1GB
autovacuum = off          # auto-vacuuming was disabled

[2] Optimised settings:

max_connections = 400
shared_buffers = 8GB      # or 2GB, depending upon which scenario was being evaluated
effective_cache_size = 24GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 3495kB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 12
max_parallel_workers_per_gather = 6
max_parallel_workers = 12
autovacuum = off          # auto-vacuuming was disabled
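As for monitoring the network during the re-run, here is a minimal
sketch, assuming the sysstat package and a host named db-server (both are
placeholders; any per-interface bandwidth monitor such as iftop would do):

# on the database server: per-interface throughput, 1-second samples
sar -n DEV 1

# from the pgbench machine: the round-trip latency floor to the server
ping -c 20 db-server

With a single client, the limiter is more likely round-trip latency than
bandwidth: assuming the default TPC-B-like pgbench script, each
transaction sends seven statements one at a time, so 75 tps remote
(~13.3 ms/transaction) versus 152 tps local (~6.6 ms/transaction) implies
the network is adding roughly 1 ms per statement round trip, i.e. nowhere
near saturating a 1 Gbps link.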