On 2020-07-16 14:56, Andrey Lepikhov wrote:
On 7/16/20 9:55 AM, Etsuro Fujita wrote:
On Wed, Jul 15, 2020 at 9:02 PM Etsuro Fujita <etsuro.fuj...@gmail.com> wrote:
On Wed, Jul 15, 2020 at 12:12 AM Alexey Kondratov
<a.kondra...@postgrespro.ru> wrote:
On 2020-07-14 15:27, Ashutosh Bapat wrote:
On Tue, Jul 14, 2020 at 12:48 AM Alexey Kondratov
<a.kondra...@postgrespro.ru> wrote:
Some real-life test queries show that all single-node queries aren't
pushed down to the required node. For example:

SELECT
      *
FROM
      documents
      INNER JOIN users ON documents.user_id = users.id
WHERE
      documents.company_id = 5
      AND users.company_id = 5;

There are a couple of things happening here:
1. The clauses on company_id in the WHERE clause are causing partition
pruning. Partition-wise join is disabled with partition pruning before
PG13.

More precisely, PWJ cannot be applied when there are no matched
partitions on the nullable side due to partition pruning before PG13.

On reflection, I think I was wrong: the limitation applies to PG13,
even with advanced PWJ.

But the join is an inner join, so I think PWJ can still be applied for
the join.

I think I was wrong on this point as well :-(.  PWJ cannot be applied
to the join due to the limitation of the PWJ matching logic.  See the
discussion started in [1].  I think the patch in [2] would address
this issue as well, though the patch is under review.


Thanks for sharing the links, Fujita-san.


I think discussion [1] has little relevance to the current task. Here
we do not join on the partition attribute, so PWJ can't be used at all.
What we can use here is a push-down join of two foreign relations.
We can analyze the baserestrictinfo of the outer and inner RelOptInfos
and may detect that only one partition from each of the outer and inner
sides needs to be joined.
Next, we would create a joinrel from the RelOptInfos of these partitions
and replace the joinrel of the partitioned tables. But this is only a
rough outline of a possible solution...
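
For readers without the test setup at hand, the kind of schema under discussion might look like the sketch below. Only the names documents, users, documents_node2, users_node2 and the columns user_id, id and company_id come from this thread; the server name, connection options, column types and the choice of LIST partitioning are assumptions made purely for illustration, and the remaining partitions are omitted.

```sql
-- Hypothetical setup: server name, connection options, column types and
-- the LIST partitioning scheme are assumptions, not taken from the thread.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER node2 FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'node2.example', dbname 'app');
CREATE USER MAPPING FOR CURRENT_USER SERVER node2
    OPTIONS (user 'app', password 'secret');

-- Both tables are partitioned by company_id, not by the join columns.
CREATE TABLE users (
    id         bigint NOT NULL,
    company_id int    NOT NULL
) PARTITION BY LIST (company_id);

CREATE TABLE documents (
    id         bigint NOT NULL,
    user_id    bigint NOT NULL,
    company_id int    NOT NULL
) PARTITION BY LIST (company_id);

-- The partitions holding company_id = 5 live on the same foreign server.
CREATE FOREIGN TABLE users_node2 PARTITION OF users
    FOR VALUES IN (5) SERVER node2 OPTIONS (table_name 'users');
CREATE FOREIGN TABLE documents_node2 PARTITION OF documents
    FOR VALUES IN (5) SERVER node2 OPTIONS (table_name 'documents');
```

In such a layout the WHERE clauses on company_id prune each table down to one foreign partition on the same server, while the join condition itself uses user_id and id.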


I was a bit skeptical after eyeballing the thread [1], but still tried the v3 patch with the current master and my test setup. Surprisingly, it just worked, though it isn't clear to me how. With this patch, the aforementioned simple join is completely pushed down to the foreign server, and the speedup is approximately the same (~3 times) as when the required partitions are explicitly used in the query.
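
For comparison, a query that uses the required partitions explicitly could be written roughly as follows. This is a sketch: the partition names are taken from the plans in this message, but the exact form of the query used for the measurement is an assumption.

```sql
-- Sketch: address the foreign partitions directly instead of the
-- partitioned parents, keeping the original aliases.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT
      *
FROM
      documents_node2 documents
      INNER JOIN users_node2 users ON documents.user_id = users.id
WHERE
      documents.company_id = 5
      AND users.company_id = 5;
```

If the join is pushed down, the plan should show a single Foreign Scan node with a "Relations:" line rather than a local join over two separate Foreign Scans.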

As a side effect, the patch also affected join + aggregate queries like:

SELECT
    user_id,
    count(*) AS documents_count
FROM
    documents
    INNER JOIN users ON documents.user_id = users.id
WHERE
    documents.company_id = 5
    AND users.company_id = 5
GROUP BY
    user_id;

With the patch, it is executed as:

 GroupAggregate
   Group Key: documents.user_id
   ->  Sort
         Sort Key: documents.user_id
         ->  Foreign Scan
               Relations: (documents_node2 documents)
                           INNER JOIN (users_node2 users)

Without the patch, its plan was:

 GroupAggregate
   Group Key: documents.user_id
   ->  Sort
         Sort Key: documents.user_id
         ->  Hash Join
               Hash Cond: (documents.user_id = users.id)
               ->  Foreign Scan on documents_node2 documents
               ->  Hash
                     ->  Foreign Scan on users_node2 users

I cannot say that it is the most efficient plan in that case, since the entire query could be pushed down to the foreign server, but it still gives a 5-10% speedup on my setup.


Regards
--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company

