On Fri, Apr 24, 2020 at 8:10 PM Antonin Houska <a...@cybertec.at> wrote:
> Andy Fan <zhihui.fan1...@gmail.com> wrote: > > > The more tests on your patch, the more powerful I feel it is! > > Thanks for the appreciation. Given the poor progress it's increasingly hard > for me to find motivation to work on it. I'll try to be more professional > :-) > > Let's not give up:) I see your patch can push down the aggregation in 3 cases at least: 1). The group by clause exists in the join eq clause. 2). The group by clause doesn't exist in join eq clause, but we have pk on on side of join eq clause. 3). The group by clause doesn't exist in join eq clause, and we don't have the pk as well. Tom well explained the correctness of the first 2 cases [1] and probably the case 3) is correct as well, but it is a bit of hard to understand. If this is a case for others as well, that probably make the review harder. So my little suggestion is can we split the patch into some smaller commit to handle each case? like: commit 1 & 2 handles case 1 & 2,commit 3 handles join relation as well. commit 4 handles the case 3. Commit 5 can avoid the two-phase aggregation for case 2. Commit 6 introduces the aggmultifn. and commit 7 to handle some outer join case Ashutosh raised at [2]. However not all the cases need to be handled at one time. Actually when I read the code, I want such separation by my own to verify my understanding is correct, but I don't have such time at least now. It maybe much easier for you since you are the author. I will be pretty pleasure to help in any case after I close my current working item (Hopefully in 2 months). > At the code level, I did some slight changes on init_grouping_targets > which may > > make the code easier to read. You are free to to use/not use it. > > I'm going to accept your change of create_rel_agg_info(), but I hesitate > about > the changes to init_grouping_targets(). > > First, is it worth to spend CPU cycles on construction of an extra list > grouping_columns? Is there a corner case in which we cannot simply pass > grouping_columns=target->exprs to check_functional_grouping()? > > Second, it's obvious that you prefer the style > > foreach () > { > if () > ... > else if () > ... > else > ... > } > > over this > > foreach () > { > if () > { > ... > continue; > } > > if () > { > ... > continue; > } > > ... > } > > I often prefer the latter and I see that the existing planner code uses > this > style quite often too. I think the reason is that it allows for more > complex > tests, while the "else-if style" requires all tests to take place inside > the > "if ()" expression. However, if several (not necessarily tightly related) > tests become "compressed" this way, it's less obvious how where to put > comments. Indeed it seems that some comments got lost due to your changes. Your explanation looks reasonable, thanks for that. the changes also used some builtin function to avoid the one more level for-loop. like tlist_member. As for the high level design, based on my current knowledge, probably no need to change. [1] https://www.postgresql.org/message-id/9726.1542577439%40sss.pgh.pa.us [2] https://www.postgresql.org/message-id/flat/CAFjFpRdpeMTd8kYbM_x0769V-aEKst5Nkg3%2BcoG%3D8ki7s8Zqjw%40mail.gmail.com