Re: Extend more usecase for planning time partition pruning and init partition pruning.

Andy Fan Fri, 19 Feb 2021 02:03:59 -0800

On Mon, Feb 8, 2021 at 3:43 PM Andy Fan <zhihui.fan1...@gmail.com> wrote:


>
>
> On Mon, Jan 25, 2021 at 10:21 AM Andy Fan <zhihui.fan1...@gmail.com>
> wrote:
>
>>
>>
>> On Sun, Jan 24, 2021 at 6:34 PM Andy Fan <zhihui.fan1...@gmail.com>
>> wrote:
>>
>>> Hi:
>>>
>>>  I recently found a use case like this.  SELECT * FROM p, q WHERE
>>> p.partkey =
>>>  q.colx AND (q.colx = $1 OR q.colx = $2); Then we can't do either
>>> planning time
>>>  partition prune or init partition prune.  Even though we have run-time
>>>  partition pruning work at last, it is too late in some cases since we
>>> have
>>>  to init all the plan nodes in advance.  In my case, there are 10+
>>>  partitioned relation in one query and the execution time is short, so
>>> the
>>>  init plan a lot of plan nodes cares a lot.
>>>
>>> The attached patches fix this issue. It just get the "p.partkey = q.colx"
>>> case in root->eq_classes or rel->joinlist (outer join), and then check
>>> if there
>>> is some baserestrictinfo in another relation which can be used for
>>> partition
>>> pruning. To make the things easier, both partkey and colx must be Var
>>> expression in implementation.
>>>
>>> - v1-0001-Make-some-static-functions-as-extern-and-extend-C.patch
>>>
>>> Just some existing refactoring and extending ChangeVarNodes to be able
>>> to change var->attno.
>>>
>>> - v1-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
>>>
>>> Do the real job.
>>>
>>> Thought?
>>>
>>>
>>>
>>> --
>>> Best Regards
>>> Andy Fan (https://www.aliyun.com/)
>>>
>>
>>
>> Some results from this patch.
>>
>> create table p (a int, b int, c character varying(8)) partition by
>> list(c);
>> create table p1  partition of p for values in ('000001');
>> create table p2  partition of p for values in ('000002');
>> create table p3  partition of p for values in ('000003');
>> create table q (a int, c character varying(8), b int) partition by
>> list(c);
>> create table q1  partition of q for values in ('000001');
>> create table q2  partition of q for values in ('000002');
>> create table q3  partition of q for values in ('000003');
>>
>> Before the patch:
>> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
>> and q.c > '000002';
>>                      QUERY PLAN
>> ----------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Append
>>          ->  Seq Scan on p1 p_1
>>          ->  Seq Scan on p2 p_2
>>          ->  Seq Scan on p3 p_3
>>    ->  Hash
>>          ->  Seq Scan on q3 q
>>                Filter: ((c)::text > '000002'::text)
>> (9 rows)
>>
>> After the patch:
>>
>>                      QUERY PLAN
>> ----------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Seq Scan on p3 p
>>    ->  Hash
>>          ->  Seq Scan on q3 q
>>                Filter: ((c)::text > '000002'::text)
>> (6 rows)
>>
>>
>> Before the patch:
>> postgres=# explain (costs off) select * from p inner join q on p.c = q.c
>> and (q.c = '000002' or q.c = '000001');
>>                                          QUERY PLAN
>>
>> --------------------------------------------------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Append
>>          ->  Seq Scan on p1 p_1
>>          ->  Seq Scan on p2 p_2
>>          ->  Seq Scan on p3 p_3
>>    ->  Hash
>>          ->  Append
>>                ->  Seq Scan on q1 q_1
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>>                ->  Seq Scan on q2 q_2
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>> (12 rows)
>>
>> After the patch:
>>                                          QUERY PLAN
>>
>> --------------------------------------------------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Append
>>          ->  Seq Scan on p1 p_1
>>          ->  Seq Scan on p2 p_2
>>    ->  Hash
>>          ->  Append
>>                ->  Seq Scan on q1 q_1
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>>                ->  Seq Scan on q2 q_2
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>> (11 rows)
>>
>> Before the patch:
>> postgres=# explain (costs off) select * from p left join q on p.c = q.c
>> where (q.c = '000002' or q.c = '000001');
>>                                          QUERY PLAN
>>
>> --------------------------------------------------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Append
>>          ->  Seq Scan on p1 p_1
>>          ->  Seq Scan on p2 p_2
>>          ->  Seq Scan on p3 p_3
>>    ->  Hash
>>          ->  Append
>>                ->  Seq Scan on q1 q_1
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>>                ->  Seq Scan on q2 q_2
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>> (12 rows)
>>
>> After the patch:
>>                                          QUERY PLAN
>>
>> --------------------------------------------------------------------------------------------
>>  Hash Join
>>    Hash Cond: ((p.c)::text = (q.c)::text)
>>    ->  Append
>>          ->  Seq Scan on p1 p_1
>>          ->  Seq Scan on p2 p_2
>>    ->  Hash
>>          ->  Append
>>                ->  Seq Scan on q1 q_1
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>>                ->  Seq Scan on q2 q_2
>>                      Filter: (((c)::text = '000002'::text) OR ((c)::text
>> = '000001'::text))
>> (11 rows)
>>
>> --
>> Best Regards
>> Andy Fan (https://www.aliyun.com/)
>>
>
>
> Here is a performance test regarding this patch.  In the following simple
> case,
> we can get 3x faster than before.
>
> create table p (a int, b int, c int) partition by list(c);
> select 'create table p_'||i||' partition of p for values in (' || i ||
> ');' from generate_series(1, 100)i; \gexec
> insert into p select i, i, i from generate_series(1, 100)i;
> create table m as select * from p;
> analyze m;
> analyze p;
>
> test sql:  select * from m, p where m.c = p.c and m.c in (3, 10);
>
> With this patch:  1.1ms
> Without this patch: 3.4ms
>
> I'm happy with the result and the implementation,  I have add this into
> commitfest https://commitfest.postgresql.org/32/2975/
>
> Thanks.
>
> --
> Best Regards
> Andy Fan (https://www.aliyun.com/)
>

Rebase to the current latest commit 678d0e239b.


-- 
Best Regards
Andy Fan (https://www.aliyun.com/)

v2-0002-Build-some-implied-pruning-quals-to-extend-the-us.patch
Description: Binary data

v2-0001-Make-some-static-functions-as-extern-and-extend-C.patch
Description: Binary data

Re: Extend more usecase for planning time partition pruning and init partition pruning.

Reply via email to