"Daniel Verite" <dan...@manitou-mail.org> writes:
> On large scripts, pgbench happens to consume a lot of CPU time.
> For instance, with a script consisting of 50000 "SELECT 1;"
> I see "pgbench -f 50k-select.sql" taking about 5.8 secs of CPU time,
> out of a total time of 6.7 secs. When run with perf, this profile shows up:

You ran only a single execution of a 50K-line script?  This test
case feels a little bit artificial.  Having said that ...

> In ParseScript(), expr_scanner_get_lineno() is called for each line
> with its current offset, and it scans the script from the beginning
> up to the current line. I think that on the whole, parsing this script
> ends up looking at (N*(N+1))/2 lines, which is about 1.25 billion lines
> if N=50000.

... yes, O(N^2) is not nice.  It has to be possible to do better.

> I wonder whether pgbench should materialize the current line number
> in a variable, as psql does in pset.lineno. But given that there are
> two different parsers in pgbench, maybe it's not the simplest.
> Flex has yylineno but neither pgbench nor psql make use of it.

Yeah, we do rely on yylineno in bootscanner.l and ecpg, but not
elsewhere; not sure if there's a performance reason for that.
I see that plpgsql has a hand-rolled version (look for cur_line_num)
that perhaps could be stolen.

			regards, tom lane
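
For illustration only, here is a rough sketch of the "remember where we
stopped counting" idea. It is not pgbench's or plpgsql's actual code:
LineTracker, line_tracker_init() and line_tracker_lineno() are hypothetical
names made up for this example. It assumes the caller asks for offsets in
nondecreasing order and never past the end of the buffer, which is roughly
how a parser walking a script front to back would behave. Each character is
then examined at most once, so the total work is O(script length) instead
of O(N^2).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of pgbench: remembers how far into the
 * script newlines have already been counted.
 */
typedef struct LineTracker
{
    const char *buf;            /* start of the script text */
    size_t      last_offset;    /* newlines before this offset are counted */
    int         lineno;         /* 1-based line number at last_offset */
} LineTracker;

static void
line_tracker_init(LineTracker *lt, const char *buf)
{
    lt->buf = buf;
    lt->last_offset = 0;
    lt->lineno = 1;
}

/*
 * Return the 1-based line number of buf[offset].
 *
 * Assumption: offsets are passed in nondecreasing order and do not exceed
 * the length of buf.  Only the text between the previous offset and the
 * current one is scanned, so repeated calls never rescan from the start.
 */
static int
line_tracker_lineno(LineTracker *lt, size_t offset)
{
    assert(offset >= lt->last_offset);

    for (size_t i = lt->last_offset; i < offset; i++)
    {
        if (lt->buf[i] == '\n')
            lt->lineno++;
    }
    lt->last_offset = offset;
    return lt->lineno;
}
```

A caller would initialize one tracker per script and call
line_tracker_lineno() with the scanner's current offset after each command.
Whether this fits pgbench's two-parser setup any better than yylineno would
is a separate question.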