Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

Ashutosh Bapat Sun, 21 May 2017 23:33:17 -0700

On Sat, Apr 29, 2017 at 12:37 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Fri, Apr 28, 2017 at 1:18 AM, Ashutosh Bapat
> <ashutosh.ba...@enterprisedb.com> wrote:
>> For two-way join this works and is fairly straight-forward. I am
>> assuming that A an B are base relations and not joins. But making it
>> work for N-way join is the challenge.
>
> I don't think it's much different, is it?  Anyway, I'm going to
> protest if your algorithm for merging bounds takes any more than
> linear time, regardless of what else we decide.
>
>>> Having said that I think we could make this work, I'm starting to
>>> agree with you that it will add more complexity than it's worth.
>>> Needing to keep track of the type of every partition bound
>>> individually seems like a real nuisance, and it's not likely to win
>>> very often because, realistically, people should and generally will
>>> use the same type for the partitioning column in all of the relevant
>>> tables.  So I'm going to revise my position and say it's fine to just
>>> give up on partitionwise join unless the types match exactly, but I
>>> still think we should try to cover the cases where the bounds don't
>>> match exactly but only 1:1 or 1:0 or 0:1 mappings are needed (iow,
>>> optimizations 1 and 2 from your list of 4).  I agree that ganging
>>> partitions (optimization 4 from your list) is not something to tackle
>>> right now.
>>
>> Good. I will have a more enjoyable vacation now.
>
> Phew, what a relief.  :-)
>
>> Do you still want the patition key type to be out of partition scheme?
>> Keeping it there means we match it only once and save it only at a
>> single place. Otherwise, it will have to be stored in RelOptInfo of
>> the partitioned table and match it for every pair of joining
>> relations.
>
> The only reason for removing things from the PartitionScheme was if
> they didn't need to be consistent across all tables.  Deciding that
> the type is one of the things that has to match means deciding it
> should be in the PartitionScheme, not the RelOptInfo.
>

Here's set of patches rebased on latest head.

I spent some time trying to implement partition-wise join when
partition bounds do not match exactly but there's 1:1, 1:0 or 0:1
mapping between partitions. A WIP patch 0017 is included in the set
for the same. The patch is not complete, it doesn't support range
partitions and needs some bugs to be fixed for list partitions. Also
because of the way it crafts partition bounds for a join, it leaks
memory consumed by partition bounds for every pair of joining
relations. I will work on fixing those issues. That patch is pretty
large now. So, I think we will have to commit it separately on top of
basic partition-wise join implementation. But you will see that it has
minimal changes to the basic partition-wise join code.

I rewrote code handling partition keys on the nullable side of the
join. Now we store partition keys from nullable and non-nullable
relations separately. The partition keys from nullable relations are
matched only when the equality operator is strict. This is explained
in details the comments in match_expr_to_partition_keys() and
build_joinrel_partition_info().

Also please note that since last patch set I have paired the
multi-level partition-wise join support patches with single-level
partition-wise join patches providing corresponding functionality.

[1]
https://www.postgresql.org/message-id/CAFjFpRd9ebX225KhuvYXQRBuk9NrVJfPzHqGPGqpze%2BqvH0xmw%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pg_dp_join_patches_v20.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables

Reply via email to