Re: pgbench client-side performance issue on large scripts

Tom Lane Mon, 24 Feb 2025 12:16:27 -0800

"Daniel Verite" <dan...@manitou-mail.org> writes:
> On large scripts, pgbench happens to consume a lot of CPU time.
> For instance, with a script consisting of 50000 "SELECT 1;"
> I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
> out of a total time of 6.7 secs. When run with perf, this profile shows up:


You ran only a single execution of a 50K-line script?  This test
case feels a little bit artificial.  Having said that ...

> In ParseScript(), expr_scanner_get_lineno() is called for each line
> with its current offset, and it scans the script from the beginning
> up to the current line. I think that on the whole, parsing this script
> ends up looking at (N*(N+1))/2 lines, which is about 1.25 billion lines
> if N=50000.

... yes, O(N^2) is not nice.  It has to be possible to do better.
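
As a rough illustration of what "better" could look like: remember the
last offset and line number, and count newlines only in the text added
since the previous call. This is a minimal sketch and not pgbench's
actual code. LineCursor, line_cursor_init() and line_cursor_advance()
are hypothetical names, and it assumes the offsets passed in never
decrease, which holds when a script is parsed front to back.

```c
#include <stddef.h>

/* Hypothetical cursor: remembers how far newlines have been counted. */
typedef struct LineCursor
{
    const char *buf;     /* start of the NUL-terminated script text */
    size_t      offset;  /* number of bytes already scanned */
    int         lineno;  /* 1-based line number at 'offset' */
} LineCursor;

static void
line_cursor_init(LineCursor *cur, const char *buf)
{
    cur->buf = buf;
    cur->offset = 0;
    cur->lineno = 1;
}

/*
 * Return the line number at byte position 'target'.
 *
 * Assumption: 'target' is never smaller than a previously requested
 * position.  Only the bytes between the previous position and 'target'
 * are examined, so a full front-to-back parse costs O(script length)
 * in total instead of rescanning from the start on every call.
 */
static int
line_cursor_advance(LineCursor *cur, size_t target)
{
    while (cur->offset < target && cur->buf[cur->offset] != '\0')
    {
        if (cur->buf[cur->offset] == '\n')
            cur->lineno++;
        cur->offset++;
    }
    return cur->lineno;
}
```

With this shape the caller would initialize one cursor per script and
ask for the line number at each command's offset as parsing proceeds.
Whether that fits cleanly with pgbench's two lexers is a separate
question.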

> I wonder whether pgbench should materialize the current line number
> in a variable, as psql does in pset.lineno. But given that there are
> two different parsers in pgbench, maybe it's not the simplest.
> Flex has yylineno but neither pgbench nor psql make use of it.

Yeah, we do rely on yylineno in bootscanner.l and ecpg, but not
elsewhere; not sure if there's a performance reason for that.
I see that plpgsql has a hand-rolled version (look for cur_line_num)
that perhaps could be stolen.

                        regards, tom lane
