adf97c156 added support for hashing to ExprState and adjusted Hash Join
to make use of it. That sped up hash value generation, as it allowed the
hashing code to be JIT compiled. It also allowed more efficient tuple
deforming, as all required attributes are deformed in one go rather than
on demand when hashing each join key.

The attached does the same for GROUP BY and hashed SubPlans. The tuple
deforming win does not exist here, but there do seem to be some gains
still to be had from JIT compilation.

Using a scale=1 TPC-H lineitem table, I ran the attached script.

The increase is far from impressive, but it's likely still worth
migrating these over to use ExprState too.

master:

alter system set jit = 0;
latency average = 1509.116 ms
latency average = 1502.496 ms
latency average = 1507.560 ms
alter system set jit = 1;
latency average = 1396.015 ms
latency average = 1392.138 ms
latency average = 1396.476 ms
alter system set jit_optimize_above_cost = 0;
latency average = 1290.463 ms
latency average = 1293.364 ms
latency average = 1290.366 ms
alter system set jit_inline_above_cost = 0;
latency average = 1294.540 ms
latency average = 1300.970 ms
latency average = 1302.181 ms

patched:

alter system set jit = 0;
latency average = 1500.183 ms
latency average = 1500.911 ms
latency average = 1504.150 ms (+0.31%)
alter system set jit = 1;
latency average = 1367.427 ms
latency average = 1367.329 ms
latency average = 1366.473 ms (+2.03%)
alter system set jit_optimize_above_cost = 0;
latency average = 1273.453 ms
latency average = 1265.348 ms
latency average = 1272.598 ms (+1.65%)
alter system set jit_inline_above_cost = 0;
latency average = 1264.657 ms
latency average = 1272.661 ms
latency average = 1273.179 ms (+2.29%)
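
The percentages above are consistent with taking the mean of the three master runs, dividing by the mean of the three patched runs, and subtracting one. A minimal sketch of that arithmetic follows, assuming that is how they were derived; speedup_pct is a hypothetical helper and not part of the attached script.

```shell
#!/bin/bash

# Hypothetical helper: prints (mean(master) / mean(patched) - 1) as a percentage.
# Usage: speedup_pct "m1 m2 m3" "p1 p2 p3"
speedup_pct() {
    awk -v m="$1" -v p="$2" 'BEGIN {
        nm = split(m, ma, " ")
        np = split(p, pa, " ")
        for (i = 1; i <= nm; i++) msum += ma[i]
        for (i = 1; i <= np; i++) psum += pa[i]
        printf "%+.2f%%\n", ((msum / nm) / (psum / np) - 1) * 100
    }'
}

# jit = 0: master runs vs patched runs
speedup_pct "1509.116 1502.496 1507.560" "1500.183 1500.911 1504.150"   # +0.31%
# jit = 1
speedup_pct "1396.015 1392.138 1396.476" "1367.427 1367.329 1366.473"   # +2.03%
```

The same call with the jit_optimize_above_cost = 0 and jit_inline_above_cost = 0 figures gives +1.65% and +2.29%.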

David
#!/bin/bash

nloops=30000
rows=1000
dbname=postgres
port=5432
seconds=30

psql -c "drop table if exists hjtbl;" -p $port $dbname
psql -c "create table hjtbl (a int not null, b int not null, c int not null, d int not null, e int not null, f int not null);" -p $port $dbname
psql -c "insert into hjtbl select a,a,a,a,a,a from generate_series(1,$rows) a;" -p $port $dbname
psql -c "vacuum freeze analyze hjtbl;" -p $port $dbname
psql -c "alter system set jit_above_cost = 0;" -p $port $dbname
psql -c "alter system set jit_optimize_above_cost = 1000000000;" -p $port $dbname
psql -c "alter system set jit_inline_above_cost = 1000000000;" -p $port $dbname
psql -c "select pg_reload_conf();" -p $port $dbname

for alt_sys in "alter system set jit = 0;" "alter system set jit = 1;" "alter system set jit_optimize_above_cost = 0;" "alter system set jit_inline_above_cost = 0;"
do
        echo "$alt_sys"
        psql -c "$alt_sys" -p $port $dbname > /dev/null
        psql -c "select pg_Reload_conf();" -p $port $dbname > /dev/null
        q=1
        for sql in \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a);" \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a,b);" \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a,b,c);" \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a,b,c,d);" \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a,b,c,d,e);" \
                "select count(*) c from (select a,b,c,d,e,f from hjtbl cross join generate_Series(1,$nloops) offset 0) h1 inner join hjtbl using(a,b,c,d,e,f);"
        do
                echo "$sql" > bench.sql
                echo -n "Q$q "
                pgbench -n -f bench.sql -p $port -M prepared -T $seconds $dbname | grep latency
                q=$((q+1))
        done
done


Attachment: v1-0001-Use-ExprStates-for-hashing-in-GROUP-BY-and-SubPla.patch
Description: Binary data
