On 2020-02-27 12:26:36 -0800, Andres Freund wrote:
> Hi,
>
> On 2015-09-25 20:35:45 +0200, Fabien COELHO wrote:
> >
> > Hello Tatsuo,
> >
> > > Hmmm... I never use -C. The formula seems ok:
> > >
> > >   tps_exclude = normal_xacts / (time_include -
> > >       (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
> >
> > Hmmm... it is not :-)
> >
> > I think that the degree of parallelism to consider is nclients, not
> > nthreads: while connection time is accumulated in conn_time, other
> > clients are possibly doing their transactions, in parallel, even if it
> > is in the same thread, so it is not "stopped time" for all clients. It
> > starts to matter with "-j 1 -c 30" and slow transactions: the cumulated
> > conn_time in each thread may be arbitrarily close to the whole time if
> > there are many clients.
>
> I think this pretty much entirely broke the tps_exclude logic when not
> using -C, especially when -c and -j differ. The wait time there is
> actually per thread, not per client.
>
> In this example I set post_auth_delay=1s on the server. Pgbench tells me:
>
> pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
> tps = 897607.544862 (including connections establishing)
> tps = 1004793.708611 (excluding connections establishing)
>
> pgbench -M prepared -c 180 -j 60 -T 10 -P1 -S
> tps = 739502.979613 (including connections establishing)
> tps = 822639.038779 (excluding connections establishing)
>
> pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
> tps = 376468.177081 (including connections establishing)
> tps = 418554.527585 (excluding connections establishing)
>
> which is pretty obviously bogus. While I'd not expect it to work
> perfectly, the "excluding" number should stay roughly constant.
>
>
> The fundamental issue is that without -C *none* of the connections in
> each thread gets to actually execute work before all of them have
> established a connection. So dividing conn_total_time by nclients is
> wrong.
>
> For more realistic connection delays this leads to the 'excluding'
> number being way too close to the 'including' number, even if a
> substantial portion of the time is spent on connection establishment.
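To make the problem concrete before getting to the patch: here is a
back-of-the-envelope sketch, not pgbench code, with purely illustrative
numbers (post_auth_delay=1s, -c 180 -j 30 -T 10, and a made-up fixed
per-client rate once connected), of how dividing the summed
conn_total_time by nclients under-subtracts the stalled time when -j is
smaller than -c:

/*
 * Illustrative only: compares the current tps_exclude formula against
 * what the wait actually is for every client when each thread opens all
 * of its connections before running any transactions (the non -C case).
 */
#include <stdio.h>

int
main(void)
{
	double		nclients = 180.0;
	double		nthreads = 30.0;
	double		conn_delay = 1.0;		/* post_auth_delay, seconds */
	double		time_include = 10.0;	/* -T 10 */

	/*
	 * Without -C a thread opens its 180/30 = 6 connections sequentially
	 * and only then starts issuing queries, so every client in that
	 * thread is stalled for ~6 seconds.
	 */
	double		per_thread_wait = conn_delay * (nclients / nthreads);

	/* conn_total_time sums each thread's conn_time: ~6s * 30 = ~180s */
	double		conn_total_time = per_thread_wait * nthreads;

	/* pretend each client manages 5000 xacts/s once it is connected */
	double		normal_xacts = nclients * 5000.0 * (time_include - per_thread_wait);

	/* current formula: spreads the summed wait over all clients */
	double		tps_excl_current =
		normal_xacts / (time_include - conn_total_time / nclients);

	/* what every client actually lost in this scenario */
	double		tps_excl_expected =
		normal_xacts / (time_include - per_thread_wait);

	printf("current formula: %.0f tps\n", tps_excl_current);	/* ~400000 */
	printf("actual stall:    %.0f tps\n", tps_excl_expected);	/* ~900000 */
	return 0;
}

With those made-up inputs the formula claims ~400k tps "excluding" even
though every client really lost ~6 of the 10 seconds, i.e. ~900k tps,
which is the same shape as the -j 30 numbers quoted above.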
Not suggesting it as an actual fix, but just multiplying the computed
conn_time within threadRun() by the number of connections the thread
handles (for the non -C case) results in much more reasonable output.

pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
before:
tps = 378393.985650 (including connections establishing)
tps = 420691.025930 (excluding connections establishing)
after:
tps = 379818.929680 (including connections establishing)
tps = 957405.785600 (excluding connections establishing)

pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
before:
tps = 906662.031099 (including connections establishing)
tps = 1012223.500880 (excluding connections establishing)
after:
tps = 906178.154876 (including connections establishing)
tps = 1012431.154463 (excluding connections establishing)

The numbers are still a bit bogus because thread startup overhead is
included, and conn_time is computed relative to the thread creation
time. But they at least make some basic sense.

diff --git i/src/bin/pgbench/pgbench.c w/src/bin/pgbench/pgbench.c
index 1159757acb0..3bc45107136 100644
--- i/src/bin/pgbench/pgbench.c
+++ w/src/bin/pgbench/pgbench.c
@@ -6265,6 +6265,16 @@ threadRun(void *arg)
 	INSTR_TIME_SET_CURRENT(thread->conn_time);
 	INSTR_TIME_SUBTRACT(thread->conn_time, thread->start_time);
 
+	/* add once for each other connection */
+	if (!is_connect)
+	{
+		instr_time	e = thread->conn_time;
+		for (i = 0; i < (nstate - 1); i++)
+		{
+			INSTR_TIME_ADD(thread->conn_time, e);
+		}
+	}
+
 	/* explicitly initialize the state machines */
 	for (i = 0; i < nstate; i++)
 	{

Greetings,

Andres Freund