On 27 July 2018 at 04:19, David Rowley <david.row...@2ndquadrant.com> wrote:
> I've attached a delta of the changes I made since your v2 delta and
> also a complete updated patch.
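For readers following along without the initial email, here is a hypothetical sketch of the kind of setup the benchmarks below rely on. The actual scripts are attached to the first message of this thread; the table name, partitioning scheme (LIST), and file names here are all assumptions for illustration only:

```shell
#!/bin/sh
# Hypothetical reconstruction of the benchmark setup (names assumed).
# Generates DDL for a table with 10k partitions, plus the single-row
# INSERT script that pgbench replays with -f.
{
  echo "CREATE TABLE partbench (id int) PARTITION BY LIST (id);"
  i=0
  while [ "$i" -lt 10000 ]; do
    echo "CREATE TABLE partbench_$i PARTITION OF partbench FOR VALUES IN ($i);"
    i=$((i + 1))
  done
} > partbench_setup.sql

# pgbench transaction: one INSERT, routed to a single partition, so the
# test measures tuple-routing setup overhead rather than routing itself.
echo "INSERT INTO partbench VALUES (1);" > partbench_insert.sql
```

The setup file would then be loaded with `psql -f partbench_setup.sql postgres` before running the pgbench commands shown below.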
I did a very quick performance test of this patch on an AWS m5d.large
instance with fsync=off. The test setup is the same as described in my
initial email on this thread. The test compares the performance of
INSERTs into a partitioned table with 10k partitions against INSERTs
into an equivalent non-partitioned table.

Patched with v2 patch on master@39d51fe87:

-- partitioned
$ pgbench -n -T 60 -f partbench_insert.sql postgres
transaction type: partbench_insert.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1063764
latency average = 0.056 ms
tps = 17729.375930 (including connections establishing)
tps = 17729.855215 (excluding connections establishing)

-- non-partitioned
$ pgbench -n -T 60 -f partbench__insert.sql postgres
transaction type: partbench__insert.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 1147273
latency average = 0.052 ms
tps = 19121.194366 (including connections establishing)
tps = 19121.695469 (excluding connections establishing)

Here we're within 92% of the non-partitioned performance. Looking back
at the first email in this thread, where I tested the v1 patch, we were
only within 82% with:

-- partitioned
tps = 11001.602377 (excluding connections establishing)

-- non-partitioned
tps = 13354.656163 (excluding connections establishing)

As with the v1 test, the v2 test was done with the locking of all
partitions removed, using:

diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d7b18f52ed..6223c62094 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -80,9 +80,6 @@ ExecSetupPartitionTupleRouting(ModifyTableState *mtstate, Relation rel)
 	PartitionTupleRouting *proute;
 	ModifyTable *node = mtstate ? (ModifyTable *) mtstate->ps.plan : NULL;
 
-	/* Lock all the partitions. */
-	(void) find_all_inheritors(RelationGetRelid(rel), RowExclusiveLock, NULL);
-
 	/*
 	 * Here we attempt to expend as little effort as possible in setting up
 	 * the PartitionTupleRouting.  Each partition's ResultRelInfo is built
@@ -442,7 +439,7 @@ ExecInitPartitionInfo(ModifyTableState *mtstate,
 	 * We locked all the partitions in ExecSetupPartitionTupleRouting
 	 * including the leaf partitions.
 	 */
-	partrel = heap_open(partoid, NoLock);
+	partrel = heap_open(partoid, RowExclusiveLock);
 
 	/*
 	 * Keep ResultRelInfo and other information for this partition in the

Again, the reduced locking is not meant for commit as part of this
patch. Changing the locking will require a discussion on its own
thread.

And just for fun, the unpatched performance on the partitioned table:

ubuntu@ip-10-0-0-33:~$ pgbench -n -T 60 -f partbench_insert.sql postgres
transaction type: partbench_insert.sql
scaling factor: 1
query mode: simple
number of clients: 1
number of threads: 1
duration: 60 s
number of transactions actually processed: 5751
latency average = 10.434 ms
tps = 95.836052 (including connections establishing)
tps = 95.838490 (excluding connections establishing)

(The patched version is a ~185x increase over this.)

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services