Hi,

On 2023-06-08 15:08:57 -0400, Gregory Smith wrote:
> Pushing SELECT statements at socket speeds with prepared statements is a
> synthetic benchmark that normally demos big pgbench numbers. My benchmark
> farm moved to Ubuntu 23.04/kernel 6.2.0-20 last month, and that test is
> badly broken on the system PG15 at larger core counts, with as much as an
> 85% drop from expectations. Since this is really just a benchmark workload
> the user impact is very narrow, probably zero really, but as the severity
> of the problem is high we should get to the bottom of what's going on.
> First round of profile data suggests the lost throughput is going here:
>
>   Overhead  Shared Object  Symbol
>    74.34%   [kernel]       [k] osq_lock
>     2.26%   [kernel]       [k] mutex_spin_on_owner

Could you get a profile with call graphs? We need to know what leads to all
those osq_lock calls.

  perf record --call-graph dwarf -a sleep 1

or such should do the trick, if run while the workload is running.

> Quick test to find if you're impacted: on the server and using sockets,
> run a 10 second SELECT test with/without preparation using 1 or 2
> clients/[core|thread] and see if preparation is the slower result. Here's
> a PGDG PG14 on port 5434 as a baseline, next to Ubuntu 23.04's regular
> PG15, all using the PG15 pgbench on AMD 5950X:

I think it's unwise to compare builds of such different vintage. The
compiler options and compiler version can have substantial effects.

> $ pgbench -i -s 100 pgbench -p 5434
> $ pgbench -S -T 10 -c 32 -j 32 -M prepared -p 5434 pgbench
> pgbench (14.8 (Ubuntu 14.8-1.pgdg23.04+1))
> tps = 1058195.197298 (without initial connection time)

I recommend also using -P1. Particularly when using unix sockets, the
specifics of how client threads and server threads are scheduled play a
huge role. How large a role can change significantly between runs and
between fairly minor changes to how things are executed (e.g. between
major PG versions).

E.g. on my workstation (two sockets, 10 cores/20 threads each), with 32
clients, performance changes back and forth between ~600k and ~850k TPS,
whereas with 42 clients it's steadily at 1.1M TPS, with little variance.

I have also seen very odd behaviour on larger machines when
/proc/sys/kernel/sched_autogroup_enabled is set to 1.

> There's been plenty of recent chatter on LKML about *osq_lock*, in January
> Intel reported a 20% benchmark regression on UnixBench that might be
> related. Work is still ongoing this week:

I've seen such issues in the past, primarily due to contention internal to
cgroups, when the memory controller is enabled. IIRC that could be
alleviated to a substantial degree with cgroup.memory=nokmem.

Greetings,

Andres Freund
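PS: In case it helps to make the above concrete, here is a rough sketch of
what I'd run; scale, client counts and the omitted -p port are just example
values taken from the commands quoted above, adjust them to your setup:

  # prepared vs. simple query protocol, with per-second progress (-P 1)
  $ pgbench -i -s 100 pgbench
  $ pgbench -S -T 10 -P 1 -c 32 -j 32 -M prepared pgbench
  $ pgbench -S -T 10 -P 1 -c 32 -j 32 -M simple pgbench

  # while the prepared-statement run is active, capture a call-graph profile
  $ perf record --call-graph dwarf -a sleep 1
  $ perf report --no-children

  # scheduler autogrouping can skew results on larger machines
  $ cat /proc/sys/kernel/sched_autogroup_enabled
  $ echo 0 | sudo tee /proc/sys/kernel/sched_autogroup_enabled

  # and check whether cgroup.memory=nokmem is already on the kernel command line
  $ cat /proc/cmdline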