Hello!
 
I ran some experiments with pgbench to measure the connection initialization 
time and found that it increases quadratically with the number of clients. 
This was surprising to me, and I would like to understand the reason for this 
behavior.
 
Some details on how it was done:
 
1) I used the branch REL_16_STABLE (commit 2caa85f4).
 
2) The default system configuration was modified (CPU speed control, memory 
control, network, RAM disk). Briefly:
    sudo cpupower frequency-set -g performance
    sudo cpupower idle-set -D0 
    sudo swapoff -a 
    sudo sh -c 'echo 16384  >/proc/sys/net/core/somaxconn'
    sudo sh -c 'echo 16384  >/proc/sys/net/core/netdev_max_backlog'
    sudo sh -c 'echo 16384  >/proc/sys/net/ipv4/tcp_max_syn_backlog'
    numactl --membind=0 bash
    sudo mount -t tmpfs -o rw,size=512G tmpfs /mnt/ramdisk
    exit
Hyperthreading and CPU boost were disabled:
    echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost
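The applied settings can be sanity-checked afterwards (an optional check, not 
part of the procedure above), e.g.:
    cpupower frequency-info | grep governor
    cat /sys/devices/system/cpu/cpufreq/boost
    sysctl net.core.somaxconn net.core.netdev_max_backlog net.ipv4.tcp_max_syn_backlog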
Please note: when testing on a fast multi-core server with a large number of 
clients, the rate at which new connections are created becomes very high, and 
even with these kernel parameters an error may occur ("Resource temporarily 
unavailable" is connect(2) failing with EAGAIN, presumably because the listen 
backlog of the Unix socket overflows):
    pgbench: error: connection to server on socket "/tmp/.s.PGSQL.5114" failed: Resource temporarily unavailable

In that case you need to apply the attached patch 
0001-Fix-fast-connection-rate-issue.patch.
 
3) The server was configured as:
    ./configure --enable-debug --with-perl --with-icu --enable-depend --enable-tap-tests

4) Build and install on the RAM drive:
    make -j$(nproc) -s && make install

5) DB initialization:
    /mnt/ramdisk/bin/initdb -k -D /mnt/ramdisk/data -U postgres

Add to postgresql.conf:
    huge_pages = off    # for the sake of test stability and reproducibility
    shared_buffers = 1GB
    max_connections = 16384

6) Start command:
    a) Start the server (e.g. on the first NUMA socket)
        /mnt/ramdisk/bin/pg_ctl -w -D /mnt/ramdisk/data start

    b) Create the test database and stop the server
        /mnt/ramdisk/bin/psql -U postgres -c 'create database bench'
        /mnt/ramdisk/bin/pg_ctl -w -D /mnt/ramdisk/data stop

7) pgbench commands:
Perform a single test sequence (I have a dual-socket server, so the server was 
running on the first socket while the clients were running on the second one):

    export PATH=/mnt/ramdisk/bin:$PATH
    export NUMACTL_CLIENT="--physcpubind=96-191 --membind=1"
    export NUMACTL_SERVER="--physcpubind=0-95 --membind=0"
    export CLIENTS=1024

    numactl $NUMACTL_SERVER pg_ctl -w -D /mnt/ramdisk/data start
    numactl $NUMACTL_CLIENT pgbench -U postgres -i -s100 bench
    numactl $NUMACTL_CLIENT psql -U postgres -d bench -c "checkpoint"
    numactl $NUMACTL_CLIENT pgbench -U postgres -c$CLIENTS -j$CLIENTS -t100 -S bench
    numactl $NUMACTL_SERVER pg_ctl -m smart -w -D /mnt/ramdisk/data stop
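
The init time measured below is the "initial connection time" line that 
pgbench prints in its summary, e.g. (illustrative numbers, not an actual run):
    initial connection time = 435.123 ms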
8) Measurements & Results
Before the measurements I rebooted the host machine and configured it as 
described above. Then I ran a script that took 30 measurements of the initial 
connection time for each number of clients; the average time and standard 
deviation were calculated as well. A sketch of such a loop is shown below.
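A minimal sketch of the idea (simplified, not the exact script; it assumes the 
PATH and NUMACTL_CLIENT environment from step 7 and parses the "initial 
connection time" line of the pgbench summary):

    #!/bin/bash
    for CLIENTS in 1024 2048 4096 8192 16384; do
        for i in $(seq 1 30); do
            numactl $NUMACTL_CLIENT pgbench -U postgres -c$CLIENTS -j$CLIENTS -t100 -S bench |
                grep 'initial connection time'
        done | awk -v c=$CLIENTS '
            { sum += $5; sumsq += $5 * $5; n++ }
            END { avg = sum / n;
                  printf "%5d clients: avg %.0f ms, stddev %.0f ms\n",
                         c, avg, sqrt(sumsq / n - avg * avg) }'
    done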
The measurement results are presented in table and graph form:
 
    Number of clients | Average init time, ms
    ------------------+-----------------------
                 1024 |   ~435 +- 20
                 2048 |  ~1062 +- 20
                 4096 |  ~3284 +- 40
                 8192 | ~11617 +- 120
                16384 | ~43391 +- 230

[Graph: average init connection time vs. number of clients]
9) The Question
It turned out that the results follow a quadratic dependence, roughly 
y ~ 0.0002*x^2, where x is the number of clients and y is the init time in ms.
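A quick plausibility check (my own arithmetic on the averages from the table 
above): each doubling of the client count multiplies the average init time by 
a factor approaching the 4x expected from a pure quadratic:

    awk 'BEGIN { t[1] = 435; t[2] = 1062; t[3] = 3284; t[4] = 11617; t[5] = 43391;
                 for (i = 2; i <= 5; i++) printf "x%.2f\n", t[i] / t[i-1] }'

This prints x2.44, x3.09, x3.54, x3.74.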
So the question is: is this expected behavior or a bug? What do you think? 
I would appreciate any comments and opinions.

--
Best regards,
Alexander Potapov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
 

Attachment: 0001-Fix-fast-connection-rate-issue-for-v16.patch