Hi,

On 2021-07-20 14:57:15 -0400, Alvaro Herrera wrote:
> On 2021-Jul-20, Andres Freund wrote:
> 
> > I think what's happening is that the first recvfrom() actually gets all 7
> > connection results. The server doesn't have any queries to process at that
> > point. But we ask the kernel whether there is new network input over and 
> > over
> > again, despite having results to process!
> 
> Hmm, yeah, that seems a missed opportunity.

> > with-isbusy:
> > ...
> > tps = 3990.424742 (without initial connection time)
> > ...
> >           1,013.71 msec task-clock                #    0.202 CPUs utilized
> >             80,203      raw_syscalls:sys_enter    #   79.119 K/sec
> >             19,947      context-switches          #   19.677 K/sec
> >      2,943,676,361      cycles:u                  #    2.904 GHz
> >        346,607,769      cycles:k                  #    0.342 GHz
> >      8,464,188,379      instructions:u            #    2.88  insn per cycle
> >        226,665,530      instructions:k            #    0.65  insn per cycle
> 
> This is quite compelling.
> 
> If you don't mind I can get this pushed soon in the next couple of days
> -- or do you want to do it yourself?

I was thinking of pushing the attached, to both 14 and master, thinking
that was what you meant, but then I wasn't quite sure: It's a relatively
minor performance improvement, after all? OTOH, it arguably also just is
a bit of an API misuse...

I'm inclined to push it to 14 and master, but ...

Greetings,

Andres Freund
>From 336497ae1df22b966c3b05826987eb3b374f551a Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Wed, 21 Jul 2021 16:40:43 -0700
Subject: [PATCH] pgbench: When using pipelining only do PQconsumeInput() when
 necessary.

Up to now we did a PQconsumeInput() for each pipelined query, asking the OS
for more input - which it often won't have, as all results might already have
been sent. That turns out to have a noticeable performance impact.

Alvaro Herrera reviewed the idea to add the PQisBusy() check, but not this
concrete patch.

Author: Andres Freund <and...@anarazel.de>
Discussion: https://postgr.es/m/20210720180039.23rivhdft3l4m...@alap3.anarazel.de
Backpatch: 14, where libpq/pgbench pipelining was introduced.
---
 src/bin/pgbench/pgbench.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 364b5a2e47d..129cf2ed61d 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -3460,7 +3460,14 @@ advanceConnectionState(TState *thread, CState *st, StatsData *agg)
 				 */
 			case CSTATE_WAIT_RESULT:
 				pg_log_debug("client %d receiving", st->id);
-				if (!PQconsumeInput(st->con))
+
+				/*
+				 * Only check for new network data if we processed all data
+				 * fetched prior. Otherwise we end up doing a syscall for each
+				 * individual pipelined query, which has a measurable
+				 * performance impact.
+				 */
+				if (PQisBusy(st->con) && !PQconsumeInput(st->con))
 				{
 					/* there's something wrong */
 					commandFailed(st, "SQL", "perhaps the backend died while processing");
-- 
2.32.0.rc2

Reply via email to