On 16.08.2019 20:17, Pavel Stehule wrote:


pá 16. 8. 2019 v 16:12 odesílatel Konstantin Knizhnik <k.knizh...@postgrespro.ru <mailto:k.knizh...@postgrespro.ru>> napsal:

    I did more investigations of performance of global temp tables
    with shared buffers vs. vanilla (local) temp tables.

    1. Combination of persistent and temporary tables in the same query.

    Preparation:
    create table big(pk bigint primary key, val bigint);
    insert into big values
    (generate_series(1,100000000),generate_series(1,100000000));
    create temp table lt(key bigint, count bigint);
    create global temp table gt(key bigint, count bigint);

    Size of table is about 6Gb, I run this test on desktop with 16GB
    of RAM and postgres with 1Gb shared buffers.
    I run two queries:

    insert into T (select count(*),pk/P as key from big group by key);
    select sum(count) from T;

    where P is (100,10,1) and T is name of temp table (lt or gt).
    The table below contains times of both queries in msec:

    Percent of selected data
        1%
        10%
        100%
    Local temp table
        44610
    90
        47920
    891
        63414
    21612
    Global temp table
        44669
    35
        47939
    298
        59159
    26015


    As you can see, time of insertion in temporary table is almost the
    same
    and time of traversal of temporary table is about twice smaller
    for global temp table
    when it fits in RAM together with persistent table and slightly
    worser when it doesn't fit.



    2. Temporary table only access.
    The same system, but Postgres is configured with
    shared_buffers=10GB, max_parallel_workers = 4,
    max_parallel_workers_per_gather = 4

    Local temp tables:
    create temp table local_temp(x1 bigint, x2 bigint, x3 bigint, x4
    bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9 bigint);
    insert into local_temp values
    (generate_series(1,100000000),0,0,0,0,0,0,0,0);
    select sum(x1) from local_temp;

    Global temp tables:
    create global temporary table global_temp(x1 bigint, x2 bigint, x3
    bigint, x4 bigint, x5 bigint, x6 bigint, x7 bigint, x8 bigint, x9
    bigint);
    insert into global_temp values
    (generate_series(1,100000000),0,0,0,0,0,0,0,0);
    select sum(x1) from global_temp;

    Results (msec):

        Insert
        Select
    Local temp table    37489
        48322
    Global temp table   44358
        3003


    So insertion in local temp table is performed slightly faster but
    select is 16 times slower!

    Conclusion:
    In the assumption then temp table fits in memory, global temp
    tables with shared buffers provides better performance than local
    temp table.
    I didn't consider here global temp tables with local buffers
    because for them results should be similar with local temp tables.


Probably there is not a reason why shared buffers should be slower than local buffers when system is under low load.

access to shared memory is protected by spin locks (are cheap for few processes), so tests in one or few process are not too important (or it is just one side of space)

another topic can be performance on MS Sys - there are stories about not perfect performance of shared memory there.

Regards

Pavel

 One more test which is used to simulate access to temp tables under high load.
I am using "upsert" into temp table in multiple connections.

create global temp table gtemp (x integer primary key, y bigint);

upsert.sql:
insert into gtemp values (random() * 1000000, 0) on conflict(x) do update set y=gtemp.y+1;

pgbench -c 10 -M prepared -T 100 -P 1 -n -f upsert.sql postgres


I failed to find some standard way in pgbech to perform per-session initialization to create local temp table,
so I just insert this code in pgbench code:

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 570cf33..af6a431 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -5994,6 +5994,7 @@ threadRun(void *arg)
                {
                        if ((state[i].con = doConnect()) == NULL)
                                goto done;
+                       executeStatement(state[i].con, "create temp table ltemp(x integer primary key, y bigint)");
                }
        }


Results are the following:
Global temp table: 117526 TPS
Local temp table:   107802 TPS


So even for this workload global temp table with shared buffers are a little bit faster.
I will be pleased if you can propose some other testing scenario.

Reply via email to