On Thu, Sep 28, 2017 at 8:57 PM, Alexander Kuzmenkov <a.kuzmen...@postgrespro.ru> wrote: > Hi Ashutosh, > > Thanks for the review. > > Jeff, I'm copying you because this is relevant to our discussion about what > to do with mergeopfamilies when adding new merge join types. > > You have renamed RestrictInfo member mergeopfamilies as > equivopfamilies. I don't think that's a good name; it doesn't convey > that these are opfamilies containing merge operators. The changes in > check_mergejoinable() suggest that an operator may act as equality > operator in few operator families and comparison operator in others. > That looks odd. Actually an operator family contains operators other > than equality operators, so you may want to retain this member and add > a new member to specify whether the clause is an equality clause or > not. > > > For mergeopfamilies, I'm not sure what is the best thing to do. I'll try to > explain my understanding of the situation, please correct me if I'm wrong. > > Before the patch, mergeopfamilies was used for two things: creating > equivalence classes and performing merge joins. > > For equivalence classes: we look at the restriction clauses, and if they > have mergeopfamilies set, it means that these clause are based on an > equality operator, and the left and right variables must be equal. To record > this fact, we create an equivalence class. The variables might be equal for > one equality operator and not equal for another, so we record the particular > operator families to which our equality operator belongs. > > For merge join: we look at the join clauses, and if they have > mergeopfamilies set, it means that these clauses are based on an equality > operator, and we can try performing this particular join as merge join. > These opfamilies are also used beforehand to create the equivalence classes > for left and right variables. The equivalence classes are used to match the > join clauses to pathkeys describing the ordering of join inputs. > > So, if we want to start doing merge joins for operators other than equality, > we still need to record their opfamilies, but only use them for the second > case and not the first. I chose to put these opfamilies to different > variables, and > name the one used for equivalence classes 'equivopfamilies' and the one used > for merge joins 'mergeopfamilies'. The equality operators are used for both > cases, so we put their opfamilies into both of these variables. > > I agree this might look confusing. Indeed, we could keep a single variable > for opfamilies, and add separate flags that show how they can be used, be > that for equivalence classes, merge joins, range joins or some combination > of them. This is similar to what Jeff did in his range merge join patch [1]. > I will think more about this and try to produce an updated patch. >
I think we have (ab?)used mergeopfamilies to indicate equality condition, which needs some changes. May be these two patches are where we can do those changes. > > In mergejoinscansel() you have just removed Assert(op_strategy == > BTEqualStrategyNumber); Probably this function is written considering > on equality operators. But now that we are using this for all other > operators, we will need more changes to this function. That may be the > reason why INNER join in your earlier example doesn't choose right > costing. > > > I changed mergejoinscansel() slightly to reflect the fact that the inner > relation is scanned from the beginning if we have an inequality merge > clause. > > > The comment change in final_cost_mergejoin() needs more work. n1, n2, > n3 are number of rows on inner side with values 1, 2, 3 resp. So n1 + > n2 + n3 + ... = size of inner relation is correct. In that context I > am not able to understand your change > + * If the merge clauses contain inequality, (n1 + n2 + ...) ~= > + * (size of inner relation)^2. > > > I extended the comment in final_cost_mergejoin(). Not sure if that > approximation makes any sense, but this is the best I could think of. > > Style problems are fixed. > > Attached please find the new version of the patch that addresses all the > review comments except mergeopfamilies. > > The current commitfest is ending, but I'd like to continue working on this > patch, so I am moving it to the next one. > > Thanks for working on the comments. I am interested to continue reviewing it in the next commitfest. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers