HI, On Oct 26, 2022, 06:09 +0800, Tom Lane <t...@sss.pgh.pa.us>, wrote: > While fooling with my longstanding outer-join variables changes > (I am making progress on that, honest), I happened to notice that > equivclass.c is leaving some money on the table by generating > redundant RestrictInfo clauses. It already attempts to not generate > the same clause twice, which can save a nontrivial amount of work > because we cache selectivity estimates and so on per-RestrictInfo. > I realized though that it will build and return a clause like > "a.x = b.y" even if it already has "b.y = a.x". This is just > wasteful. It's always been the case that equivclass.c will > produce clauses that are ordered according to its own whims. > Consumers that need the operands in a specific order, such as > index scans or hash joins, are required to commute the clause > to be the way they want it while building the finished plan. > Therefore, it shouldn't matter which order of the operands we > return, and giving back the commutator clause if available could > potentially save as much as half of the selectivity-estimation > work we do with these clauses subsequently. > > Hence, PFA a patch that adjusts create_join_clause() to notice > commuted as well as exact matches among the EquivalenceClass's > existing clauses. This results in a number of changes visible in > regression test cases, but they're all clearly inconsequential. > > The only thing that I think might be controversial here is that > I dropped the check for matching operator OID. To preserve that, > we'd have needed to use get_commutator() in the reverse-match cases, > which it seemed to me would be a completely unjustified expenditure > of cycles. The operators we select for freshly-generated clauses > will certainly always match those of previously-generated clauses. > Maybe there's a chance that they'd not match those of ec_sources > clauses (that is, the user-written clauses we started from), but > if they don't and that actually makes any difference then surely > we are talking about a buggy opclass definition. > > I've not bothered to make any performance tests to see if there's > actually an easily measurable gain here. Saving some duplicative > selectivity estimates could be down in the noise ... but it's > surely worth the tiny number of extra tests added here. > > Comments? > > regards, tom lane > Make sense.
How about combine ec->ec_sources and ec->derives as one list for less codes? ``` foreach(lc, list_union(ec->ec_sources, ec->ec_derives)) { rinfo = (RestrictInfo *) lfirst(lc); if (rinfo->left_em == leftem && rinfo->right_em == rightem && rinfo->parent_ec == parent_ec) return rinfo; if (rinfo->left_em == rightem && rinfo->right_em == leftem && rinfo->parent_ec == parent_ec) return rinfo; } ``` I have a try, it will change some in join.out and avoid changes in tidscan.out. Regards, Zhang Mingli >