On 2018/02/03 6:05, Robert Haas wrote:
> On Fri, Feb 2, 2018 at 9:33 AM, Robert Haas <robertmh...@gmail.com> wrote:
>>> Updated set of patches attached (patches 0002 onward mostly unchanged,
>>> except I incorporated the delta patch posted by David upthread).
>>
>> Committed 0001.

Thanks.

> Some preliminary thoughts...

Thanks for the review.

> Regarding 0002, I can't help noticing that this adds a LOT of new code
> to partition.c.  With 0002 applied, it climbs into the top 2% of all
> ".c" files in terms of lines of code.  It seems to me, though, that
> maybe it would make sense to instead add all of this code to some new
> .c file, e.g. src/backend/optimizer/util/partprune.c.  I realize
> that's a little awkward in this case because we're hoping to use this
> code at runtime and not just in the optimizer, but I don't have a
> better idea.  Using src/backend/catalog as a dumping-ground for code
> that doesn't have a clear-cut place to live is not a superior
> alternative, for sure.

Agreed.  partition.c has gotten quite big, and more than half of the code
that this patch adds really seems to belong outside of it.

> And it seems to me that the code you're adding
> here is really quite similar to what we've already got in that
> directory -- for example, predtest.c, which currently does partition
> pruning, lives there; so does clauses.c, whose evaluate_expr facility
> this patch wants to use; so does relnode.c, which the other patches
> modify; and in general this looks an awful lot like other optimizer
> logic that decomposes clauses.  I'm open to other suggestions but I
> don't think adding all of this directly into partition.c is a good
> plan.

Agreed.  A partprune.c in the optimizer's util directory seems like a good
place.

> If we do add a new file for this code, the header comment for that
> file would be a good place to write an overall explanation of this new
> facility.  The individual bits and pieces seem to have good comments
> but an overall explanation of what's going on here seems to be
> lacking.

OK, I will add such a comment.

> It doesn't seem good that get_partitions_from_clauses requires us to
> reopen the relation.  I'm going to give my standard review feedback
> any time someone injects additional relation_open or heap_open calls
> into a patch: please look for a way to piggyback on one of the places
> that already has the relation open instead of adding code to re-open
> it elsewhere.  Reopening it is not entirely free, and, especially when
> NoLock is used, makes it hard to tell whether we're doing the locking
> correctly.  Note that we've already got things like
> set_relation_partition_info (which copies the bounds) and
> set_baserel_partition_key_exprs (which copies, with some preprocessing,
> the partitioning expressions) and also find_partition_scheme, but
> instead of using that existing data from the RelOptInfo, this patch is
> digging it directly out of the relcache entry, which doesn't seem
> great.

OK, I have to admit that the quoted heap_open wasn't quite well thought
out, and I'm almost sure that everything should be fine with the
information that set_relation_partition_info() fills in the RelOptInfo.
I'm now going through the patch to figure out how to make that work.
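To make that a little more concrete, the direction I have in mind looks
roughly like the sketch below.  The function name and the placeholder body
are only for illustration (not what the current patch does); the point is
that the bounds, the number of partitions, and the key expressions are all
available from the RelOptInfo, so no heap_open()/relation_open() should be
needed here:

/* Rough sketch only -- name and shape are placeholders. */
static Bitmapset *
get_matching_partitions(PlannerInfo *root, RelOptInfo *rel, List *clauses)
{
    PartitionScheme part_scheme = rel->part_scheme; /* find_partition_scheme */
    PartitionBoundInfo boundinfo = rel->boundinfo;  /* set_relation_partition_info */
    Bitmapset  *result = NULL;
    int         i;

    /* Not a partitioned table, or no partitions to prune. */
    if (part_scheme == NULL || boundinfo == NULL || rel->nparts == 0)
        return NULL;

    /*
     * Here, match 'clauses' against rel->partexprs (filled in by
     * set_baserel_partition_key_exprs) and compare the extracted values
     * with boundinfo.  As a placeholder, simply return all partitions.
     */
    for (i = 0; i < rel->nparts; i++)
        result = bms_add_member(result, i);

    return result;
}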
> The changes to set_append_rel_pathlist probably make it slower in the
> case where rte->relkind != RELKIND_PARTITIONED_TABLE.  We build a
> whole new list that we don't really need.  How about just keeping the
> (appinfo->parent_relid != parentRTindex) test in the loop and setting
> rel_appinfos to either root->append_rel_list or
> rel->live_part_appinfos as appropriate?

That's certainly better.  I will do the same in set_append_rel_size.

> I understand why COLLATION_MATCH thinks that a collation OID match is
> OK, but why is InvalidOid also OK?  Can you add a comment?  Maybe some
> test cases, too?

partcollid == InvalidOid means the partition key is of an uncollatable
type, so further checking the collation is unnecessary.  There is already
a test in partition_prune.sql that covers the failure to prune when
collations don't match for a text partition key.
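In code terms, the intended behavior is roughly the following (a sketch of
the intent, not necessarily the exact definition in the posted patch):

/*
 * A collation mismatch only matters for a collatable partition key; an
 * uncollatable key has partcollid == InvalidOid, in which case any clause
 * collation is acceptable.
 */
#define COLLATION_MATCH(partcollid, exprcollid) \
    (!OidIsValid(partcollid) || (partcollid) == (exprcollid))

I'll add a comment along these lines to the macro.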
> How fast is this patch these days, compared with the current approach?
> It would be good to test both when nearly all of the partitions are
> pruned and when almost none of the partitions are pruned.

I will include some performance numbers in my next email, which I hope to
send no later than Friday this week.

Thanks,
Amit