ne 18. 8. 2019 v 9:02 odesílatel Konstantin Knizhnik < k.knizh...@postgrespro.ru> napsal:
> > > On 16.08.2019 20:17, Pavel Stehule wrote: > > > > pá 16. 8. 2019 v 16:12 odesílatel Konstantin Knizhnik < > k.knizh...@postgrespro.ru> napsal: > >> I did more investigations of performance of global temp tables with >> shared buffers vs. vanilla (local) temp tables. >> >> 1. Combination of persistent and temporary tables in the same query. >> >> Preparation: >> create table big(pk bigint primary key, val bigint); >> insert into big values >> (generate_series(1,100000000),generate_series(1,100000000)); >> create temp table lt(key bigint, count bigint); >> create global temp table gt(key bigint, count bigint); >> >> Size of table is about 6Gb, I run this test on desktop with 16GB of RAM >> and postgres with 1Gb shared buffers. >> I run two queries: >> >> insert into T (select count(*),pk/P as key from big group by key); >> select sum(count) from T; >> >> where P is (100,10,1) and T is name of temp table (lt or gt). >> The table below contains times of both queries in msec: >> >> Percent of selected data >> 1% >> 10% >> 100% >> Local temp table >> 44610 >> 90 >> 47920 >> 891 >> 63414 >> 21612 >> Global temp table >> 44669 >> 35 >> 47939 >> 298 >> 59159 >> 26015 >> >> As you can see, time of insertion in temporary table is almost the same >> and time of traversal of temporary table is about twice smaller for >> global temp table >> when it fits in RAM together with persistent table and slightly worser >> when it doesn't fit. >> >> >> >> 2. Temporary table only access. >> The same system, but Postgres is configured with shared_buffers=10GB, >> max_parallel_workers = 4, max_parallel_workers_per_gather = 4 >> >> Local temp tables: >> create temp table local_temp(x1 bigint, x2 bigint, x3 bigint, x4 bigint, >> x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint); >> insert into local_temp values >> (generate_series(1,100000000),0,0,0,0,0,0,0,0); >> select sum(x1) from local_temp; >> >> Global temp tables: >> create global temporary table global_temp(x1 bigint, x2 bigint, x3 >> bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint); >> insert into global_temp values >> (generate_series(1,100000000),0,0,0,0,0,0,0,0); >> select sum(x1) from global_temp; >> >> Results (msec): >> >> Insert >> Select >> Local temp table 37489 >> 48322 >> Global temp table 44358 >> 3003 >> >> So insertion in local temp table is performed slightly faster but select >> is 16 times slower! >> >> Conclusion: >> In the assumption then temp table fits in memory, global temp tables with >> shared buffers provides better performance than local temp table. >> I didn't consider here global temp tables with local buffers because for >> them results should be similar with local temp tables. >> > > Probably there is not a reason why shared buffers should be slower than > local buffers when system is under low load. > > access to shared memory is protected by spin locks (are cheap for few > processes), so tests in one or few process are not too important (or it is > just one side of space) > > another topic can be performance on MS Sys - there are stories about not > perfect performance of shared memory there. > > Regards > > Pavel > > One more test which is used to simulate access to temp tables under high > load. > I am using "upsert" into temp table in multiple connections. > > create global temp table gtemp (x integer primary key, y bigint); > > upsert.sql: > insert into gtemp values (random() * 1000000, 0) on conflict(x) do update > set y=gtemp.y+1; > > pgbench -c 10 -M prepared -T 100 -P 1 -n -f upsert.sql postgres > > > I failed to find some standard way in pgbech to perform per-session > initialization to create local temp table, > so I just insert this code in pgbench code: > > diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c > index 570cf33..af6a431 100644 > --- a/src/bin/pgbench/pgbench.c > +++ b/src/bin/pgbench/pgbench.c > @@ -5994,6 +5994,7 @@ threadRun(void *arg) > { > if ((state[i].con = doConnect()) == NULL) > goto done; > + executeStatement(state[i].con, "create temp table > ltemp(x integer primary key, y bigint)"); > } > } > > > Results are the following: > Global temp table: 117526 TPS > Local temp table: 107802 TPS > > > So even for this workload global temp table with shared buffers are a > little bit faster. > I will be pleased if you can propose some other testing scenario. > please, try to increase number of connections. Regards Pavel