Bharath-san, all,

Hmm, I didn't see any performance degradation on my poor man's Linux VM (4 
CPUs, 4 GB RAM, HDD)...

[benchmark preparation]
autovacuum = off
shared_buffers = 1GB
checkpoint_timeout = 1h
max_wal_size = 8GB
min_wal_size = 8GB
(other settings to enable parallelism)
CREATE UNLOGGED TABLE a (c char(1100));
INSERT INTO a SELECT i FROM generate_series(1, 300000) i;
(the table size is 335 MB)

[benchmark]
CREATE TABLE b AS SELECT * FROM a;
DROP TABLE a;
CHECKPOINT;
(measure only CTAS)


[results]
parallel_leader_participation = off
  workers  time(ms)
        0      3921
        2      3290
        4      3132
parallel_leader_participation = on
  workers  time(ms)
        2      3266
        4      3247


This may be a controversial, perhaps even crazy, idea, but the following 
change brought a 4-11% speedup.  I tried it because I suspected that parallel 
workers contend for WAL flushes: each of them uses the limited ring buffer and 
has to flush dirty buffers whenever the ring buffer fills up.  Can we take 
advantage of this?

[GetBulkInsertState]
/*  bistate->strategy = GetAccessStrategy(BAS_BULKWRITE);*/
    bistate->strategy = NULL;
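
To give a feel for how often the ring would fill during this test, here is a 
rough sketch of the arithmetic.  The 16 MB ring size for BAS_BULKWRITE and the 
8 kB block size are assumptions (what I recall from GetAccessStrategy() in 
freelist.c and the default BLCKSZ), not something measured here, so please 
check them against the branch you build.  The helper names are made up for 
illustration.

```c
#include <stdio.h>

/* ASSUMPTIONS, not taken from the benchmark: default 8 kB block size and a
 * 16 MB ring for BAS_BULKWRITE. */
#define BLOCK_SIZE_BYTES      8192L
#define BULKWRITE_RING_BYTES  (16L * 1024 * 1024)

/* Number of buffers in the assumed bulk-write ring. */
long
ring_buffers(void)
{
    return BULKWRITE_RING_BYTES / BLOCK_SIZE_BYTES;
}

/* Number of pages in a table of the given size in MB. */
long
table_pages(long table_mb)
{
    return table_mb * 1024 * 1024 / BLOCK_SIZE_BYTES;
}

/* How many times the table's pages cycle through the ring. */
double
ring_passes(long table_mb)
{
    return (double) table_pages(table_mb) / (double) ring_buffers();
}
```

Under those assumptions, ring_passes(335) says the 335 MB table goes around 
the ring about 21 times, so buffer reuse (and the flushing that comes with it) 
would be frequent.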


[results]
parallel_leader_participation = off
  workers  time(ms)
        0      3695  (5% reduction)
        2      3135  (4% reduction)
        4      2767  (11% reduction)
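
For anyone who wants to recheck the percentages, a small sketch follows.  It 
only uses the timings quoted in this mail; the function names are made up for 
illustration.  The figures in the table appear to be rounded down.

```c
#include <stdio.h>

/* Percentage by which after_ms is shorter than before_ms. */
double
reduction_percent(double before_ms, double after_ms)
{
    return (before_ms - after_ms) / before_ms * 100.0;
}

/* Print the reduction per worker count, using the timings quoted above
 * (parallel_leader_participation = off in both runs). */
void
print_reductions(void)
{
    const int    workers[]   = {0, 2, 4};
    const double before_ms[] = {3921, 3290, 3132};  /* BAS_BULKWRITE ring */
    const double after_ms[]  = {3695, 3135, 2767};  /* strategy = NULL */

    for (int i = 0; i < 3; i++)
        printf("workers=%d  %.1f%% reduction\n",
               workers[i], reduction_percent(before_ms[i], after_ms[i]));
}
```

Calling print_reductions() from main() lists the three reductions side by side.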


Regards
Takayuki Tsunakawa
