On 2020-07-16 14:56, Andrey Lepikhov wrote:
On 7/16/20 9:55 AM, Etsuro Fujita wrote:
On Wed, Jul 15, 2020 at 9:02 PM Etsuro Fujita
<etsuro.fuj...@gmail.com> wrote:
On Wed, Jul 15, 2020 at 12:12 AM Alexey Kondratov
<a.kondra...@postgrespro.ru> wrote:
On 2020-07-14 15:27, Ashutosh Bapat wrote:
On Tue, Jul 14, 2020 at 12:48 AM Alexey Kondratov
<a.kondra...@postgrespro.ru> wrote:
Some real-life test queries show that all single-node queries aren't
pushed down to the required node. For example:
SELECT
    *
FROM
    documents
    INNER JOIN users ON documents.user_id = users.id
WHERE
    documents.company_id = 5
    AND users.company_id = 5;
There are a couple of things happening here:
1. The clauses on company_id in the WHERE clause are causing partition
pruning. Partition-wise join is disabled with partition pruning before
PG13.
More precisely, PWJ cannot be applied when there are no matched
partitions on the nullable side due to partition pruning before PG13.
On reflection, I think I was wrong: the limitation applies to PG13,
even with advanced PWJ.
But the join is an inner join, so I think PWJ can still be applied for
the join.
I think I was wrong in this point as well :-(. PWJ cannot be applied
to the join due to the limitation of the PWJ matching logic. See the
discussion started in [1]. I think the patch in [2] would address
this issue as well, though the patch is under review.
Thanks for sharing the links, Fujita-san.
I think discussion [1] has little relevance to the current task. Here
we do not join on the partition attribute, so PWJ can't be used at all.
What we can use here is a pushed-down join of two foreign relations.
We can analyze the baserestrictinfo of the outer and inner RelOptInfos
and may detect that only one partition from each side needs to be
joined.
Next, we would create a joinrel from the RelOptInfos of these partitions
and use it to replace the joinrel of the partitioned tables. But this is
only a rough outline of a possible solution...
I was a bit skeptical after eyeballing the thread [1], but still tried
the v3 patch with the current master and my test setup. Surprisingly, it
just worked, though it isn't clear to me how. With this patch, the
aforementioned simple join is completely pushed down to the foreign
server, and the speedup is approximately the same (~3 times) as when the
required partitions are explicitly used in the query.
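For illustration, the explicit-partition form of the join might look
like the sketch below. It assumes that documents_node2 and users_node2
(the partition names that show up in the plans later in this message)
are the partitions holding company_id = 5; the exact query used for the
comparison is not reproduced here.

```sql
-- Sketch only: addresses the partitions directly instead of the
-- partitioned parents. The partition names are taken from the plans
-- in this message; their mapping to company_id = 5 is an assumption.
SELECT
    *
FROM
    documents_node2 AS documents
    INNER JOIN users_node2 AS users ON documents.user_id = users.id
WHERE
    documents.company_id = 5
    AND users.company_id = 5;
```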
As a side effect, it also affected join + aggregate queries like:
SELECT
    user_id,
    count(*) AS documents_count
FROM
    documents
    INNER JOIN users ON documents.user_id = users.id
WHERE
    documents.company_id = 5
    AND users.company_id = 5
GROUP BY
    user_id;
With the patch it is executed as:
GroupAggregate
  Group Key: documents.user_id
  ->  Sort
        Sort Key: documents.user_id
        ->  Foreign Scan
              Relations: (documents_node2 documents) INNER JOIN (users_node2 users)
Without the patch its plan was:
GroupAggregate
  Group Key: documents.user_id
  ->  Sort
        Sort Key: documents.user_id
        ->  Hash Join
              Hash Cond: (documents.user_id = users.id)
              ->  Foreign Scan on documents_node2 documents
              ->  Hash
                    ->  Foreign Scan on users_node2 users
I cannot say that it is the most efficient plan in that case, since the
entire query could be pushed down to the foreign server, but it still
gives a 5-10% speedup on my setup.
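The two plans can be compared with a plain EXPLAIN. The sketch below is
an assumption about how one might reproduce them, not necessarily the
commands used for the numbers above. enable_partitionwise_join and
enable_partitionwise_aggregate are standard planner settings (both off
by default); whether they change the plan for this query depends on the
partitioning scheme and on the patch being applied.

```sql
-- Assumed session settings; both default to off.
SET enable_partitionwise_join = on;
SET enable_partitionwise_aggregate = on;

-- COSTS OFF keeps the output close to the plan shapes quoted above.
EXPLAIN (COSTS OFF)
SELECT
    user_id,
    count(*) AS documents_count
FROM
    documents
    INNER JOIN users ON documents.user_id = users.id
WHERE
    documents.company_id = 5
    AND users.company_id = 5
GROUP BY
    user_id;
```

If the whole query were pushed down, the Sort and GroupAggregate nodes
would be expected to disappear from the local plan, leaving a single
Foreign Scan.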
Regards
--
Alexey Kondratov
Postgres Professional https://www.postgrespro.com
Russian Postgres Company