Hello Kuroda-san, On Fri, 11 Jun 2021 08:58:45 +0000 "kuroda.hay...@fujitsu.com" <kuroda.hay...@fujitsu.com> wrote:
> Hi Hackers, > > I played pgbench with wrong parameters, and I found bug-candidate. > The latest commit in my source is 3a09d75. > > 1. Do initdb and start. > 2. Initialize schema and data with "scale factor" = 1. > 3. execute following command many times: > > $ pgbench -c 101 -j 10 postgres > > Then, sometimes the negative " initial connection time" was returned. > Lateyncy average is also strange. > > ``` > $ pgbench -c 101 -j 10 postgres > starting vacuum...end. > pgbench: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: > FATAL: sorry, too many clients already > pgbench (PostgreSQL) 14.0 > transaction type: <builtin: TPC-B (sort of)> > scaling factor: 1 > query mode: simple > number of clients: 101 > number of threads: 10 > number of transactions per client: 10 > number of transactions actually processed: 910/1010 > latency average = 41387686.662 ms > initial connection time = -372896921.586 ms > tps = 0.002440 (without initial connection time) > ``` > > I sought pgbench.c and found a reason. > When a thread failed to get some connections, they do not fill any values to > thread->bench_start in threadRun(). > And if the failure is caused in the final thread (this means threads[nthreads > - 1]->bench_start is zero), > the following if-statement sets bench_start to zero. > > ``` > 6494 /* first recorded benchmarking start time */ > 6495 if (bench_start == 0 || thread->bench_start < > bench_start) > 6496 bench_start = thread->bench_start; > ``` > > The wrong bench_start propagates to printResult() and then some invalid > values are appered. > > ``` > 6509 printResults(&stats, pg_time_now() - bench_start, > conn_total_duration, > 6510 bench_start - start_time, > latency_late); > ``` > > I cannot distinguish whether we have to fix it, but I attache the patch. > This simply ignores a result when therad->bench_start is zero. + /* skip if the thread faild to get connection */ + if (thread->bench_start == 0) + continue; It detects if a thread failed to get the initial connection by thread->bench_start == 0, but this assumes the initial value is zero. For ensuring this, I think it is better to initialize it in an early state, for example like this. @@ -6419,6 +6419,7 @@ main(int argc, char **argv) initRandomState(&thread->ts_throttle_rs); initRandomState(&thread->ts_sample_rs); thread->logfile = NULL; /* filled in later */ + thread->bench_start = 0; thread->latency_late = 0; initStats(&thread->stats, 0) typo: faild -> failed Regards, Yugo Nagata -- Yugo NAGATA <nag...@sraoss.co.jp>