On Tue, Nov 3, 2020 at 4:54 PM Bharath Rupireddy
<bharath.rupireddyforpostg...@gmail.com> wrote:
>
> Use case 1 - 100mn tuples, 2 integer columns, exec time in sec:
> HEAD: 131.507 when the select part is not parallel, 128.832 when the select part is parallel
> Patch: 98.925 when the select part is not parallel, 52.901 when the select part is parallel
>
> Use case 2 - 10mn tuples, 4 integer and 6 text columns, exec time in sec:
> HEAD: 76.801 when the select part is not parallel, 66.074 when the select part is parallel
> Patch: 74.083 when the select part is not parallel, 57.739 when the select part is parallel
>
I did some more testing with the v1 patch. Execution time is in seconds, each test was run 2 times, with a custom configuration [1].

Use case 3: 1 int and 1 text column, each row size 129 bytes, size of the text column 101 bytes, number of rows 100 million, size of heap file 12.9GB.
HEAD: 253.227, 259.575
Patch: 177.921, 174.196
We get ~1.4X better performance with the patch.

Use case 4: 1 int and 30 text columns, each row size 28108 bytes, size of each text column 932 bytes, number of rows 10000, size of heap file 281.08MB.
HEAD: 222.812, 218.837
Patch: 222.492, 222.295
We don't see much difference with and without the patch. Each time only 2 tuples (2*28108 = 56216 bytes < MAX_MULTI_INSERT_BUFFERED_BYTES (65535 bytes)) are buffered and then flushed.

Use case 5: 1 int and 75 text columns, each row size 70228 bytes, size of each text column 932 bytes, number of rows 10000, size of heap file 702.28MB.
HEAD: 554.709, 562.745
Patch: 553.378, 560.370
We don't see much difference with and without the patch. Since each row size (70228 bytes) is bigger than MAX_MULTI_INSERT_BUFFERED_BYTES (65535 bytes), the multi insert code is not picked and each row is inserted with table_tuple_insert() itself. (A rough standalone sketch of this buffering decision is at the end of this mail.)

Use case 6: 1 int and 1 text column, each row size 9205 bytes, size of the text column 9173 bytes, number of rows 10000, size of heap file 92.05MB.
HEAD: 70.583, 70.251
Patch: 72.633, 73.521
We see 2-3 seconds more with the patch. When I intentionally set the computed tuple size to 0 (sz = 0) after GetTupleSize(), which means single inserts happen, the results are 70.364, 70.406. It looks like the 2-3 seconds of extra time is due to the multi insert code, and it happens only with this use case. I think this should not be a problem as the difference is not huge.

+        sz = GetTupleSize(slot, MAX_MULTI_INSERT_BUFFERED_BYTES);
+        sz = 0;        /* intentionally forced to 0 for this test */
+
+        /* In case the computed tuple size is 0, we go for single inserts. */
+        if (sz != 0)
+        {

[1] The postgresql.conf settings used:
shared_buffers = 40GB
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com
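PS: For anyone reading along without the patch handy, below is a rough, self-contained sketch (my own illustration, not the patch's actual code) of the buffering decision the numbers above exercise: GetTupleSize() reports 0 when a row would not fit under MAX_MULTI_INSERT_BUFFERED_BYTES, in which case we fall back to a single insert; otherwise tuples are buffered and flushed once the limit is reached. The helper names (GetTupleSizeSketch, SingleInsertSketch, FlushBufferSketch), the buffer struct, and the tuple-count limit are assumptions made only for illustration.

/*
 * Illustrative sketch only -- not the patch's actual code.  The helper
 * names, the buffer struct, and MAX_MULTI_INSERT_BUFFERED_TUPLES are
 * assumptions for illustration.
 */
#include <stddef.h>

#define MAX_MULTI_INSERT_BUFFERED_BYTES   65535
#define MAX_MULTI_INSERT_BUFFERED_TUPLES  1000

typedef struct InsertBufferSketch
{
    void   *slots[MAX_MULTI_INSERT_BUFFERED_TUPLES]; /* buffered tuple slots */
    int     nused;                                    /* number of buffered tuples */
    size_t  nbytes;                                   /* buffered bytes so far */
} InsertBufferSketch;

/* Hypothetical helpers standing in for the real routines. */
extern size_t GetTupleSizeSketch(void *slot, size_t limit);   /* returns 0 if tuple > limit */
extern void   SingleInsertSketch(void *slot);                 /* table_tuple_insert() analogue */
extern void   FlushBufferSketch(InsertBufferSketch *buf);     /* multi insert of buffered tuples */

static void
InsertOneTupleSketch(InsertBufferSketch *buf, void *slot)
{
    size_t  sz = GetTupleSizeSketch(slot, MAX_MULTI_INSERT_BUFFERED_BYTES);

    if (sz == 0)
    {
        /* Row too wide to buffer (as in use case 5): fall back to a single insert. */
        SingleInsertSketch(slot);
        return;
    }

    buf->slots[buf->nused++] = slot;
    buf->nbytes += sz;

    /*
     * Flush once the byte or tuple threshold is reached.  With ~28108-byte
     * rows (use case 4) only 2 rows fit under the 65535-byte limit before a
     * flush, which is why the gain there is small.
     */
    if (buf->nbytes >= MAX_MULTI_INSERT_BUFFERED_BYTES ||
        buf->nused >= MAX_MULTI_INSERT_BUFFERED_TUPLES)
    {
        FlushBufferSketch(buf);
        buf->nused = 0;
        buf->nbytes = 0;
    }
}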