On Tue, Jul 25, 2017 at 9:39 PM, Dilip Kumar <dilipbal...@gmail.com> wrote: > On Tue, Jul 25, 2017 at 8:59 PM, Robert Haas <robertmh...@gmail.com> wrote: >> On Tue, Jul 25, 2017 at 1:31 AM, Rafia Sabih >> <rafia.sa...@enterprisedb.com> wrote: >>> - other queries show a good 20-30% improvement in performance. Performance >>> numbers are as follows, >>> >>> Query| un_part_head (seconds) | part_head (seconds) | part_patch (seconds) | >>> 3 | 76 |127 | 88 | >>> 4 |17 | 244 | 41 | >>> 5 | 52 | 123 | 84 | >>> 7 | 73 | 134 | 103 | >>> 10 | 67 | 111 | 89 | >>> 12 | 53 | 114 | 99 | >>> 18 | 447 | 709 | 551 | >> >> Hmm. This certainly shows that benefit of the patch, although it's >> rather sad that we're still slower than if we hadn't partitioned the >> data in the first place. Why does partitioning hurt performance so >> much? > > I was analysing some of the plans (without partition and with > partition), Seems like one of the reasons of performing worse with the > partitioned table is that we can not use an index on the partitioned > table. > > Q4 is taking 17s without partition whereas it's taking 244s with partition. > > Now if we analyze the plan > > Without partition, it can use parameterize index scan on lineitem > table which is really big in size. But with partition, it has to scan > this table completely. > > -> Nested Loop Semi Join > -> Parallel Bitmap Heap Scan on orders > -> Bitmap Index Scan on > idx_orders_orderdate (cost=0.00..24378.88 r > -> Index Scan using idx_lineitem_orderkey on > lineitem (cost=0.57..29.29 rows=105 width=8) (actual > time=0.031..0.031 rows=1 loops=1122364) > Index Cond: (l_orderkey = > orders.o_orderkey) > Filter: (l_commitdate < l_receiptdate) > Rows Removed by Filter: 1 >
If the partitions have the same indexes as the unpartitioned table, planner manages to create parameterized plans for each partition and thus parameterized plan for the whole partitioned table. Do we have same indexes on unpartitioned table and each of the partitions? The difference between the two cases is the parameterized path on an unpartitioned table scans only one index whereas that on the partitioned table scans the indexes on all the partitions. My guess is the planner thinks those many scans are costlier than hash/merge join and chooses those strategies over parameterized nest loop join. In case of partition-wise join, only one index on the inner partition is involved and thus partition-wise join picks up parameterized nest loop join. Notice, that this index is much smaller than the index on the partitioned table, so the index scan will be a bit faster. But only a bit, since the depth of the index doesn't increase linearly with the size of index. Rrun-time partition pruning will improve performance even without partition-wise join since partition pruning will be able to eliminate all but one partition and only one index needs to be scanned. If planner is smart enough to cost that effectively, it will choose parameterized nest loop join for partitioned table thus improving the performance similar to unpartitioned case. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers