On Fri, Feb 2, 2018 at 9:33 AM, Robert Haas <robertmh...@gmail.com> wrote: >> Updated set of patches attached (patches 0002 onward mostly unchanged, >> except I incorporated the delta patch posted by David upthread). > > Committed 0001. Thanks.
Some preliminary thoughts... Regarding 0002, I can't help noticing that this adds a LOT of new code to partition.c. With 0002 applied, it climbs into the top 2% of all ".c" files in terms of lines of code. It seems to me, though, that maybe it would make sense to instead add all of this code to some new file .c file, e.g. src/backend/optimizer/util/partprune.c. I realize that's a little awkward in this case because we're hoping to use this code at runtime and not just in the optimizer, but I don't have a better idea. Using src/backend/catalog as a dumping-ground for code that doesn't have a clear-cut place to live is not a superior alternative, for sure. And it seems to me that the code you're adding here is really quite similar to what we've already got in that directory -- for example, predtest.c, which currently does partition pruning, lives there; so does clauses.c, whose evaluate_expr facility this patch wants to use; so does relnode.c, which the other patches modify; and in general this looks an awful lot like other optimizer logic that decomposes clauses. I'm open to other suggestions but I don't think adding all of this directly into partition.c is a good plan. If we do add a new file for this code, the header comment for that file would be a good place to write an overall explanation of this new facility. The individual bits and pieces seem to have good comments but an overall explanation of what's going on here seems to be lacking. It doesn't seem good that get_partitions_from_clauses requires us to reopen the relation. I'm going to give my standard review feedback any time someone injects additional relation_open or heap_open calls into a patch: please look for a way to piggyback on one of the places that already has the relation open instead of adding code to re-open it elsewhere. Reopening it is not entirely free, and, especially when NoLock is used, makes it hard to tell whether we're doing the locking correctly. Note that we've already got things like set_relation_partition_info (which copies the bounds) and set_baserel_partition_key_exprs (which copies, with some partitioning, the partitioning expressions) and also find_partition_scheme, but instead of using that existing data from the RelOptInfo, this patch is digging it directly out of the relcache entry, which doesn't seem great. The changes to set_append_rel_pathlist probably make it slower in the case where rte->relkind != RELKIND_PARTITIONED_TABLE. We build a whole new list that we don't really need. How about just keeping the (appinfo->parent_relid != parentRTindex) test in the loop and setting rel_appinfos to either root->append_rel_list or rel->live_part_appinfos as appropriate? I understand why COLLATION_MATCH think that a collation OID match is OK, but why is InvalidOid also OK? Can you add a comment? Maybe some test cases, too? How fast is this patch these days, compared with the current approach? It would be good to test both when nearly all of the partitions are pruned and when almost none of the partitions are pruned. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company