Hi,

On 2015-09-25 20:35:45 +0200, Fabien COELHO wrote:
>
> Hello Tatsuo,
>
> > Hmmm... I never use -C. The formula seems ok:
> >
> > tps_exclude = normal_xacts / (time_include -
> > (INSTR_TIME_GET_DOUBLE(conn_total_time) / nthreads));
>
> Hmmm... it is not:-)
>
> I think that the degree of parallelism to consider is nclients, not
> nthreads: while connection time is accumulated in conn_time, other clients
> are possibly doing their transactions, in parallel, even if it is in the
> same thread, so it is not "stopped time" for all clients. It starts to
> matter with "-j 1 -c 30" and slow transactions, the cumulated conn_time in
> each thread may be arbitrary close to the whole time if there are many
> clients.

I think this pretty much entirely broke the tps_exclude logic when not
using -C, especially when -c and -j differ. The wait time there is
actually per thread, not per client.

In this example I set post_auth_delay=1s on the server. Pgbench tells
me:

pgbench -M prepared -c 180 -j 180 -T 10 -P1 -S
tps = 897607.544862 (including connections establishing)
tps = 1004793.708611 (excluding connections establishing)

pgbench -M prepared -c 180 -j 60 -T 10 -P1 -S
tps = 739502.979613 (including connections establishing)
tps = 822639.038779 (excluding connections establishing)

pgbench -M prepared -c 180 -j 30 -T 10 -P1 -S
tps = 376468.177081 (including connections establishing)
tps = 418554.527585 (excluding connections establishing)

which pretty obviously is bogus. While I wouldn't expect it to work
perfectly, the "excluding" number should stay roughly constant.

The fundamental issue is that without -C *none* of the connections in
each thread gets to actually execute work before all of them have
established a connection. So dividing conn_total_time by nclients is
wrong.

For more realistic connection delays this leads to the 'excluding'
number being way too close to the 'including' number, even if a
substantial portion of the time is spent establishing connections.

Greetings,

Andres Freund
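
PS: To make the arithmetic concrete, here is a minimal sketch in C that
compares the two divisors. It is not pgbench code: the run_stats struct
and the function names are made up for illustration, and it assumes the
simplified model described above, where without -C each thread is
stalled for its share of the summed connection time before any of its
clients runs a transaction.

```c
#include <assert.h>

/*
 * Simplified model of the "excluding connections establishing" figure.
 * Every name here is hypothetical; only the shape of the formula is
 * taken from the quoted tps_exclude expression.
 */
typedef struct
{
    double normal_xacts;     /* transactions completed during the run */
    double time_include;     /* wall-clock run time, in seconds */
    double conn_total_time;  /* connection time summed over all threads, in seconds */
    int    nclients;         /* value of -c */
    int    nthreads;         /* value of -j */
} run_stats;

/* Subtract conn_total_time / divisor from the run time, then compute tps. */
double
tps_excluding(const run_stats *r, int divisor)
{
    double stalled;

    assert(divisor > 0);
    stalled = r->conn_total_time / divisor;
    assert(stalled < r->time_include);  /* otherwise the result is meaningless */

    return r->normal_xacts / (r->time_include - stalled);
}

/* Divisor argued for here: the stall is per thread. */
double
tps_exclude_per_thread(const run_stats *r)
{
    return tps_excluding(r, r->nthreads);
}

/* Divisor under discussion: treats the stall as per client. */
double
tps_exclude_per_client(const run_stats *r)
{
    return tps_excluding(r, r->nclients);
}
```

For an idealized run (assumed numbers, not measurements) of 10 s with
-c 180 -j 30, 1 s per connection (so conn_total_time = 180 s) and
3600000 transactions, the "including" figure is 360000 tps. The
per-thread divisor subtracts 6 s and gives 900000 tps, while the
per-client divisor subtracts only 1 s and gives 400000 tps, which stays
close to the "including" number, as in the output above.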