connection establishment versus parallel workers

Nathan Bossart Wed, 11 Dec 2024 12:43:18 -0800

My team recently received a report about connection establishment times
increasing substantially from v16 onwards.  Upon further investigation,
this seems to have something to do with commit 7389aad (which moved a lot
of postmaster code out of signal handlers) in conjunction with workloads
that generate many parallel workers.  I've attached a set of reproduction
steps.  The issue seems to be worst on larger machines (e.g., r8g.48xlarge,
r5.24xlarge) when max_parallel_workers/max_worker_processes are set very
high (>= 48).


Our theory is that commit 7389aad (and follow-ups like commit 239b175) made
parallel worker processing much more responsive to the point of contending
with incoming connections, and that before this change, the kernel balanced
the execution of the signal handlers and ServerLoop() to prevent this.  I
don't have a concrete proposal yet, but I thought it was still worth
starting a discussion.  TBH I'm not sure we really need to do anything
since this arguably comes down to a trade-off between connection and worker
responsiveness.

-- 
nathan

setup:

psql -h "$host" -U "$user" postgres <<EOF
create database mydb;
\c mydb
create table mytable (a int);
insert into mytable (a) select * from generate_series(1, 500000);
EOF

--

start many parallel workers (may need to change -c and -j):

pgbench -h "$host" -U "$user" mydb -c 30 -j 2 -T 300 -f <(echo '
begin;
select count(*) from mytable;
end;')
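
To confirm that the pgbench run above is actually launching parallel
workers, something like the following could be run alongside it.  This is
only a sketch: count_parallel_workers is a hypothetical helper name, and it
assumes the same $host and $user variables as the other steps.

```shell
# Hypothetical helper: print the number of parallel worker backends
# currently listed in pg_stat_activity.
# Assumes $host and $user are set as in the other steps.
count_parallel_workers() {
  psql -h "$host" -U "$user" -d mydb -At -c \
    "select count(*) from pg_stat_activity where backend_type = 'parallel worker'"
}
```

If the count stays at or near zero, -c and -j (or the planner's choice of a
parallel plan) probably need adjusting before the connection timings mean
much.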

--

while pgbench is running, check connection establishment times:

while :; do
  time pg_isready -h "$host" -U "$user" -d postgres
  sleep 1
done
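
The loop above prints raw `time` output for each probe.  To compare runs
(e.g., v15 versus v16), a summary may be easier to read.  The following is
a rough sketch, not part of the original steps: probe_ms and summarize_ms
are hypothetical helper names, and probe_ms assumes GNU date (for %N) and
the same $host and $user variables.

```shell
# Hypothetical helper: time a single pg_isready call and print the
# elapsed wall-clock time in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
probe_ms() {
  probe_start=$(date +%s%N)
  pg_isready -h "$host" -U "$user" -d postgres >/dev/null 2>&1
  probe_end=$(date +%s%N)
  echo $(( (probe_end - probe_start) / 1000000 ))
}

# Hypothetical helper: read one millisecond value per line on stdin and
# print the sample count, minimum, average, and maximum.
summarize_ms() {
  awk 'NR == 1 { min = max = $1 }
       { sum += $1; if ($1 < min) min = $1; if ($1 > max) max = $1 }
       END { if (NR > 0) printf "n=%d min=%d avg=%.1f max=%d\n", NR, min, sum / NR, max }'
}
```

For example, while pgbench is running:

  for i in $(seq 1 30); do probe_ms; sleep 1; done | summarize_ms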
