Hello!

I ran some experiments with pgbench to measure the initialization time
and found that the time increases quadratically with the number of
clients. This surprised me, and I would like to understand the reason
for this behavior.

Some details on how it was done:

1) I used the branch REL_16_STABLE (commit 2caa85f4).

2) The default system configuration was modified (CPU speed control,
memory control, network, ram disk). Briefly:

    sudo cpupower frequency-set -g performance
    sudo cpupower idle-set -D0
    sudo swapoff -a
    sudo sh -c 'echo 16384 >/proc/sys/net/core/somaxconn'
    sudo sh -c 'echo 16384 >/proc/sys/net/core/netdev_max_backlog'
    sudo sh -c 'echo 16384 >/proc/sys/net/ipv4/tcp_max_syn_backlog'
    numactl --membind=0 bash
    sudo mount -t tmpfs -o rw,size=512G tmpfs /mnt/ramdisk
    exit

Hyperthreading and CPU boost were disabled:

    echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost

Please note: when testing on a fast multi-core server with a large
number of clients, the rate of new connections can become very high,
and even with these kernel parameters the following error may occur:

    pgbench:pgbench: error: connection to server on socket "/tmp/.s.PGSQL.5114" failed: Resource temporarily unavailable
In that case you need to apply the patch
0001-Fix-fast-connection-rate-issue.patch (attached).

3) The server was configured as:

    ./configure --enable-debug --with-perl --with-icu --enable-depend --enable-tap-tests

4) Build and install on the ram disk:

    make -j$(nproc) -s && make install

5) DB initialization:

    /mnt/ramdisk/bin/initdb -k -D /mnt/ramdisk/data -U postgres

Add to postgresql.conf:

    huge_pages = off        # for the sake of test stability and reproducibility
    shared_buffers = 1GB
    max_connections = 16384

6) Start commands:

a) Start the server (e.g. on the first NUMA socket):

    /mnt/ramdisk/bin/pg_ctl -w -D /mnt/ramdisk/data start

b) Create the test database and stop the server:

    /mnt/ramdisk/bin/psql -U postgres -c 'create database bench'
    /mnt/ramdisk/bin/pg_ctl -w -D /mnt/ramdisk/data stop

7) pgbench commands:

Perform a single test sequence (I have a dual-socket server, so the
server was running on the first socket while the clients were running
on the second one):

    export PATH=/mnt/ramdisk/bin:$PATH
    export NUMACTL_CLIENT="--physcpubind=96-191 --membind=1"
    export NUMACTL_SERVER="--physcpubind=0-95 --membind=0"
    export CLIENTS=1024
    numactl $NUMACTL_SERVER pg_ctl -w -D /mnt/ramdisk/data start
    numactl $NUMACTL_CLIENT pgbench -U postgres -i -s100 bench
    numactl $NUMACTL_CLIENT psql -U postgres -d bench -c "checkpoint"
    numactl $NUMACTL_CLIENT pgbench -U postgres -c$CLIENTS -j$CLIENTS -t100 -S bench
    numactl $NUMACTL_SERVER pg_ctl -m smart -w -D /mnt/ramdisk/data stop

8) Measurements & Results

Before the measurements I rebooted the host machine and configured it
as described above. After that I ran a script that took 30 measurements
of the initial connection time for each number of clients and
calculated the average time and the standard deviation.
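For reference, the measurement loop described above can be sketched roughly as follows. This is not the exact script used for these measurements, only a minimal sketch. It assumes that pgbench prints a line of the form "initial connection time = N ms" (PostgreSQL 14 and later do). The function names extract_init_ms, mean_stddev and measure are made up for illustration. The numactl pinning and any server restart between runs are left out.

```shell
#!/bin/sh
# Illustrative sketch only: the helper names below are hypothetical.

# Pull the number out of pgbench's "initial connection time = N ms" line.
extract_init_ms() {
    sed -n 's/^initial connection time = \([0-9.]*\) ms.*/\1/p'
}

# Read one number per line; print "mean stddev" (sample standard deviation).
mean_stddev() {
    awk '{ n++; sum += $1; sumsq += $1 * $1 }
         END {
             if (n < 2) { print "need at least 2 samples" > "/dev/stderr"; exit 1 }
             mean = sum / n
             var = (sumsq - n * mean * mean) / (n - 1)
             if (var < 0) var = 0   # guard against rounding error
             printf "%.1f %.1f\n", mean, sqrt(var)
         }'
}

# Run pgbench RUNS times (default 30) with CLIENTS clients and summarize
# the initial connection time in ms.
measure() {
    clients=$1
    runs=${2:-30}
    i=0
    while [ "$i" -lt "$runs" ]; do
        pgbench -U postgres -c"$clients" -j"$clients" -t100 -S bench 2>/dev/null |
            extract_init_ms
        i=$((i + 1))
    done | mean_stddev
}
```

With the server running, `measure 1024` would print the mean and the standard deviation of the initial connection time over 30 runs with 1024 clients.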
The measurement results are presented as a graph and in table form:

    Number of clients    Average init time, ms
    1024                 ~435 +-20
    2048                 ~1062 +-20
    4096                 ~3284 +-40
    8192                 ~11617 +-120
    16384                ~43391 +-230

9) The Question

It turned out that the results correspond to a quadratic dependence
like y ~ 0.0002x^2, where x is the number of clients and y is the init
time (ms).

So the question is: is this expected behavior or a bug? What do you
think? I appreciate any comments and opinions.

--
Best regards,
Alexander Potapov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
0001-Fix-fast-connection-rate-issue-for-v16.patch