On Tue, Nov 3, 2020 at 4:54 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> Use case 1 - 100mn tuples, 2 integer columns, exec time in sec:
> HEAD: 131.507 when the select part is not parallel, 128.832 when the select part is parallel
> Patch: 98.925 when the select part is not parallel, 52.901 when the select part is parallel
>
> Use case 2 - 10mn tuples, 4 integer and 6 text columns, exec time in sec:
> HEAD: 76.801 when the select part is not parallel, 66.074 when the select part is parallel
> Patch: 74.083 when the select part is not parallel, 57.739 when the select part is parallel
>
I did some more testing with the v1 patch. Execution time is in seconds, each test was run 2 times, with a custom configuration [1].

Use case 3: 1 int and 1 text column, each row size 129 bytes, size of the text column 101 bytes, number of rows 100 million, size of heap file 12.9GB.
HEAD: 253.227, 259.575
Patch: 177.921, 174.196
We get ~1.4X better performance with the patch.

Use case 4: 1 int and 30 text columns, each row size 28108 bytes, size of each text column 932 bytes, number of rows 10000, size of heap file 281.08MB.
HEAD: 222.812, 218.837
Patch: 222.492, 222.295
We don't see much difference with and without the patch. Each time only 2 tuples (2*28108 = 56216 bytes < MAX_MULTI_INSERT_BUFFERED_BYTES (65535 bytes)) are buffered and then flushed.

Use case 5: 1 int and 75 text columns, each row size 70228 bytes, size of each text column 932 bytes, number of rows 10000, size of heap file 702.28MB.
HEAD: 554.709, 562.745
Patch: 553.378, 560.370
We don't see much difference with and without the patch. Since each row size (70228 bytes) is bigger than MAX_MULTI_INSERT_BUFFERED_BYTES (65535 bytes), the multi insert code is not picked and each row is inserted with table_tuple_insert() itself. (A rough standalone sketch of this buffering decision is at the end of this mail.)

Use case 6: 1 int and 1 text column, each row size 9205 bytes, size of the text column 9173 bytes, number of rows 10000, size of heap file 92.05MB.
HEAD: 70.583, 70.251
Patch: 72.633, 73.521
We see 2-3 seconds more with the patch. When I intentionally set the computed tuple size to 0 (sz = 0) after GetTupleSize(), which means single inserts happen, the results are 70.364, 70.406. It looks like the 2-3 seconds of extra time is due to the multi insert code, and it happens only with this use case. I think this should not be a problem as the difference is not huge.

+        sz = GetTupleSize(slot, MAX_MULTI_INSERT_BUFFERED_BYTES);
+        sz = 0;        /* intentionally forced to 0 for this test */
+
+        /* In case the computed tuple size is 0, we go for single inserts. */
+        if (sz != 0)
+        {

[1] The postgresql.conf settings used:
shared_buffers = 40GB
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
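PS: For anyone reading along without the patch handy, below is a rough, self-contained sketch (my own illustration, not the patch's actual code) of the buffering decision the numbers above exercise: GetTupleSize() reports 0 when a row would not fit under MAX_MULTI_INSERT_BUFFERED_BYTES, in which case we fall back to a single insert; otherwise tuples are buffered and flushed once the limit is reached. The helper names (GetTupleSizeSketch, SingleInsertSketch, FlushBufferSketch), the buffer struct, and the tuple-count limit are assumptions made only for illustration.

/*
 * Illustrative sketch only -- not the patch's actual code.  The helper
 * names, the buffer struct, and MAX_MULTI_INSERT_BUFFERED_TUPLES are
 * assumptions for illustration.
 */
#include <stddef.h>

#define MAX_MULTI_INSERT_BUFFERED_BYTES   65535
#define MAX_MULTI_INSERT_BUFFERED_TUPLES  1000

typedef struct InsertBufferSketch
{
    void   *slots[MAX_MULTI_INSERT_BUFFERED_TUPLES]; /* buffered tuple slots */
    int     nused;                                    /* number of buffered tuples */
    size_t  nbytes;                                   /* buffered bytes so far */
} InsertBufferSketch;

/* Hypothetical helpers standing in for the real routines. */
extern size_t GetTupleSizeSketch(void *slot, size_t limit);   /* returns 0 if tuple > limit */
extern void   SingleInsertSketch(void *slot);                 /* table_tuple_insert() analogue */
extern void   FlushBufferSketch(InsertBufferSketch *buf);     /* multi insert of buffered tuples */

static void
InsertOneTupleSketch(InsertBufferSketch *buf, void *slot)
{
    size_t  sz = GetTupleSizeSketch(slot, MAX_MULTI_INSERT_BUFFERED_BYTES);

    if (sz == 0)
    {
        /* Row too wide to buffer (as in use case 5): fall back to a single insert. */
        SingleInsertSketch(slot);
        return;
    }

    buf->slots[buf->nused++] = slot;
    buf->nbytes += sz;

    /*
     * Flush once the byte or tuple threshold is reached.  With ~28108-byte
     * rows (use case 4) only 2 rows fit under the 65535-byte limit before a
     * flush, which is why the gain there is small.
     */
    if (buf->nbytes >= MAX_MULTI_INSERT_BUFFERED_BYTES ||
        buf->nused >= MAX_MULTI_INSERT_BUFFERED_TUPLES)
    {
        FlushBufferSketch(buf);
        buf->nused = 0;
        buf->nbytes = 0;
    }
}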