On Sat, Dec 11, 2021 at 7:31 AM 曾文旌(义从) <wenjing....@alibaba-inc.com> wrote:

>
>
> ------------------原始邮件 ------------------
> *发件人:*Tomas Vondra <tomas.von...@enterprisedb.com>
> *发送时间:*Wed Dec 8 11:26:35 2021
> *收件人:*曾文旌(义从) <wenjing....@alibaba-inc.com>, shawn wang <
> shawn.wang...@gmail.com>, ggys...@gmail.com <ggys...@gmail.com>,
> PostgreSQL Hackers <pgsql-hack...@postgresql.org>
> *抄送:*wjzeng <wjzeng2...@gmail.com>
> *主题:*Re: 回复:Re: Is it worth pushing conditions to sublink/subplan?
>
>> Hi,
>>
>> On 12/7/21 10:44, 曾文旌(义从) wrote:
>> > Hi Hackers
>> >
>> > For my previous proposal, I developed a prototype and passed
>> > regression testing. It works similarly to subquery's qual pushdown.
>> > We know that sublink expands at the beginning of each level of
>> > query. At this stage, The query's conditions and equivalence classes
>> > are not processed. But after generate_base_implied_equalities the
>> > conditions are processed,  which is why qual can push down to
>> > subquery but sublink not.
>> >
>> > My POC implementation chose to delay the sublink expansion in the
>> > SELECT clause (targetList) and where clause. Specifically, it is
>> > delayed after generate_base_implied_equalities. Thus, the equivalent
>> > conditions already established in the Up level query can be easily
>> > obtained in the sublink expansion process (make_subplan).
>> >
>> > For example, if the up level query has a.id = 10 and the sublink
>> > query has a.id = b.id, then we get b.id = 10 and push it down to the
>> > sublink quey. If b is a partitioned table and is partitioned by id,
>> > then a large number of unrelated subpartitions are pruned out, This
>> > optimizes a significant amount of Planner and SQL execution time,
>> > especially if the partitioned table has a large number of
>> > subpartitions and is what I want.
>> >
>> > Currently, There were two SQL failures in the regression test,
>> > because the expansion order of sublink was changed, which did not
>> > affect the execution result of SQL.
>> >
>> > Look forward to your suggestions on this proposal.
>> >
>>
>> I took a quick look, and while I don't see / can't think of any problems
>> with delaying it until after generating implied equalities, there seems
>> to be a number of gaps.
>>
>> *Thank you for your attention.*
>>
>> 1) Are there any regression tests exercising this modified behavior?
>> Maybe there are, but if the only changes are due to change in order of
>> targetlist entries, that doesn't seem like a clear proof.
>>
>> It'd be good to add a couple tests exercising both the positive and
>> negative case (i.e. when we can and can't pushdown a qual).
>>
>> *I added several samples to the regress(qual_pushdown_to_sublink.sql). *
>> *and I
>> used the partition table to show the plan status of qual being pushed down 
>> into sublink.*
>>
>> *Hopefully this will help you understand the details of this patch. Later, I 
>> will add more cases.*
>>
>> 2) apparently, contrib/postgres_fdw does crash like this:
>>
>>   #3  0x000000000077b412 in adjust_appendrel_attrs_mutator
>>       (node=0x13f7ea0, context=0x7fffc3351b30) at appendinfo.c:470
>>   470          Assert(!IsA(node, SubLink));
>>   (gdb) p node
>>   $1 = (Node *) 0x13f7ea0
>>   (gdb) p *node
>>   $2 = {type = T_SubLink}
>>
>>   Backtrace attached.
>>
>> *For the patch attached in the last email, I passed all the tests under
>> src/test/regress.*
>> *As you pointed out, there was a problem with regression under contrib(in
>> contrib/postgres_fdw). *
>> *This time I fixed it and the current patch (V2) can pass the
>> check-world.*
>>
>> 3) various parts of the patch really need at least some comments, like:
>>
>>   - try_push_outer_qual_to_sublink_query really needs some docs
>>
>>   - new stuff at the end of initsplan.c
>>
>> *Ok, I added some comments and will
>> add more. If you have questions about any details,*
>> *please point them out directly.*
>>
>> 4) generate_base_implied_equalities
>>
>>    shouldn't this
>>
>>         if (ec->ec_processed)
>>             ;
>>
>>    really be?
>>
>>         if (ec->ec_processed)
>>             continue;
>>
>> *You are right. I fixed it.*
>>
>> 5) I'm not sure why we need the new ec_processed flag.
>>
>>
>> *I did this to eliminate duplicate equalities from the two 
>> generate_base_implied_equalities calls*
>>
>> *1) I need the base equivalent expression generated after 
>> generate_base_implied_equalities,*
>> *which is used to pushdown qual to sublink(lazy_process_sublinks)*
>>
>> *2) The expansion of sublink may result in an equivalent expression with 
>> parameters, such as a = $1,*
>> *which needs to deal with the equivalence classes again.*
>>
>> *3) So, I added ec_processed and asked to process it again 
>> (generate_base_implied_equalities)*
>> *after the equivalence class changed (add_eq_member/process_equivalence).*
>>
>> *Maybe you have a better suggestion, please let me know.*
>>
>> 6) So we now have lazy_process_sublink callback? Does that mean we
>> expand sublinks in two places - sometimes lazily, sometimes not?
>>
>> *Yes, not all sublink is delayed. Let me explain this:*
>> *1) I added a GUC switch enable_lazy_process_sublink. If it is turned off, 
>> all
>> lazy process sublink will not happen,*
>>
>> *qual pushdown to sublink depend on lazy procee sublink, which means no 
>> quals will be pushed down.*
>> *2) Even  if enable_lazy_process_sublink = true
>> If Query in this level contains some complex features,*
>> *sublink in this level query will not try do qual pushdown. (see function
>> query_has_sublink_try_pushdown_qual).*
>>
>> *I want to support a minimum subset first. Then consider complex features 
>> such as CTE/DML.*
>> *3) Finally, under conditions 1 and 2,
>> all kinds of sublink contained in the SELECT clause or*
>> *WHERE clause will delays expansion and try pushdown qual. The
>> sublink elsewhere in the SQL statement*
>> *does not delay process.*
>>
>> *The current status meets my requirements for
>> now. Of course, after this scheme is proved to be feasible, maybe*
>> *we can discuss that all sublinks are processed by overall delay, just like
>> qual pushdown to subquery.*
>>
>> *thanks*
>>
>> *Wenjing*
>>
>>
>>
>> regards
>>
>> --
>> Tomas Vondra
>> EnterpriseDB: http://www.enterprisedb.com
>> The Enterprise PostgreSQL Company
>>
> Hi,

+       /* The outer var could exist in any of the upper-level queries to
find these roots */

to find these roots -> so find these roots

+           if (has_unexpand_sublink(root) && checkExprHasSubLink(node))

has_unexpand_sublink -> has_unexpanded_sublink

+   if (enable_lazy_process_sublink)
+       return true;

The above can be simplified to:

  return enable_lazy_process_sublink;

+           if (checkExprHasSubLink(qual))
+           {
+               qual = lazy_process_sublink_qual(root, qual);
+               newquals = lappend(newquals, qual);
+           }
+           else
+               newquals = lappend(newquals, qual);

Since the lappend() is common to both branches, you can remove the else
clause. In the if block, only call lazy_process_sublink_qual().

+       /* under lazy process sublink, parent root may have some data that
child not need, so set it to NULL */
+       subroot->join_info_list = NIL;

minor correction to the comment above:
  under lazy process sublink, parent root may have some data that child
does not need, so set it to NIL

+void
+preprocess_qual_conditions(PlannerInfo *root, Node *jtnode, bool istop)

Please add a comment explaining the meaning of istop.

+       if (istop)
+           f->quals = preprocess_expression_ext(root, f->quals,
EXPRKIND_QUAL, false);
+       else
+           f->quals = preprocess_expression(root, f->quals, EXPRKIND_QUAL);

I think the code would be more readable if you replace
the preprocess_expression() call in else branch with call
to preprocess_expression_ext().

+           context->root->unexpand_sublink_counter++;

unexpand_sublink_counter -> unexpanded_sublink_counter++

For sublink_query_push_qual(), the return at the end is not needed.

For condition_is_safe_pushdown_to_sublink, you can initialize context this
way :

+   equal_expr_info_context context = {0};

+   if (cvar && cvar->varattno > 0 && equal(cvar, var))
+       return true;

The last few lines of condition_is_safe_pushdown_to_sublink() can be
written as:

  return cvar && cvar->varattno > 0 && equal(cvar, var);

+       if (equal_expr_safety_check(node, &context))
+       {
+           /* It needs to be something like outer var = inner var */
+           if (context.inner_var &&

The nested if blocks can be merged into one if block.

Cheers

Reply via email to