On 7 September 2018 at 19:29, David Rowley <david.row...@2ndquadrant.com> wrote:
> While reviewing some other patches to improve partitioning performance
> I noticed that one of the loops in ExecFindInitialMatchingSubPlans()
> could be coded a bit more efficiently. The current code loops over
> all the original subplans checking if the subplan is newly pruned, if
> it is, the code sets the new_subplan_indexes array element to -1, else
> it assigns the new subplan index. This can be done more
> efficiently if we make this array 1-based and initialise the whole
> thing to 0 then just loop over the non-pruned subplans instead of all
> subplans. Pruning all but 1 subplan is quite common.
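
For readers following along, the idea in the quoted text can be sketched roughly as below. This is a standalone illustration, not the patch itself: the helper names build_new_subplan_indexes() and lookup_new_index() are hypothetical, and it assumes that "1-based" means the stored values are the new index plus one, so that a zero-filled array already marks every subplan as pruned. It also assumes nsubplans is greater than zero.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not PostgreSQL code): build a map from old subplan
 * index to new subplan index.  Stored values are 1-based, so 0 means
 * "pruned".  surviving[] holds the old indexes of the subplans that were
 * not pruned, in ascending order.
 */
static int *
build_new_subplan_indexes(int nsubplans, const int *surviving, int nsurviving)
{
    /* calloc zero-fills, so every entry starts out marked as pruned */
    int *map = calloc((size_t) nsubplans, sizeof(int));

    if (map == NULL)
        return NULL;

    /* Visit only the surviving subplans rather than all nsubplans */
    for (int i = 0; i < nsurviving; i++)
        map[surviving[i]] = i + 1;

    return map;
}

/* Translate an old index to its new 0-based index, or -1 if pruned */
static int
lookup_new_index(const int *map, int old_index)
{
    return map[old_index] - 1;
}
```

With 300 subplans and a single survivor, the assignment loop here runs once rather than 300 times; the array still has to be zeroed, but that is one bulk operation rather than a per-element branch.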

I was looking at this again and I realised that we can completely skip
re-sequencing the subplan map when we're not going to perform any
further pruning during execution.

We possibly could also avoid making a copy of the subplan_map at all in
this case in ExecCreatePartitionPruneState(), and just take the
planner's copy verbatim as we do for the subpart_map. I was unable to
see any performance gain from doing this, so I've left it for now.

Currently, this improves performance by about 2% with prepared queries
and 300 partitions.

Patched:

tps = 5169.169452 (excluding connections establishing)
tps = 5155.914286 (excluding connections establishing)

Unpatched:

tps = 5059.511370 (excluding connections establishing)
tps = 5082.851062 (excluding connections establishing)

However, with other patches to remove partitioning bottlenecks in the
executor, the TPS goes to about 25,000, so 2% becomes 10%, which seems
more meaningful.

I've attached an updated patch which skips the re-sequence work when it
is not required for anything.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

v2-0001-Improve-performance-of-run-time-partition-pruning.patch