Hi Hackers, I played pgbench with wrong parameters, and I found bug-candidate. The latest commit in my source is 3a09d75.
1. Do initdb and start. 2. Initialize schema and data with "scale factor" = 1. 3. execute following command many times: $ pgbench -c 101 -j 10 postgres Then, sometimes the negative " initial connection time" was returned. Lateyncy average is also strange. ``` $ pgbench -c 101 -j 10 postgres starting vacuum...end. pgbench: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: FATAL: sorry, too many clients already pgbench (PostgreSQL) 14.0 transaction type: <builtin: TPC-B (sort of)> scaling factor: 1 query mode: simple number of clients: 101 number of threads: 10 number of transactions per client: 10 number of transactions actually processed: 910/1010 latency average = 41387686.662 ms initial connection time = -372896921.586 ms tps = 0.002440 (without initial connection time) ``` I sought pgbench.c and found a reason. When a thread failed to get some connections, they do not fill any values to thread->bench_start in threadRun(). And if the failure is caused in the final thread (this means threads[nthreads - 1]->bench_start is zero), the following if-statement sets bench_start to zero. ``` 6494 /* first recorded benchmarking start time */ 6495 if (bench_start == 0 || thread->bench_start < bench_start) 6496 bench_start = thread->bench_start; ``` The wrong bench_start propagates to printResult() and then some invalid values are appered. ``` 6509 printResults(&stats, pg_time_now() - bench_start, conn_total_duration, 6510 bench_start - start_time, latency_late); ``` I cannot distinguish whether we have to fix it, but I attache the patch. This simply ignores a result when therad->bench_start is zero. How do you think? Best Regards, Hayato Kuroda FUJITSU LIMITED
ignore_failed_thread.patch
Description: ignore_failed_thread.patch