On Thu, Jul 20, 2017 at 5:30 PM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote: > On Thu, Jul 20, 2017 at 11:47 PM, Ashutosh Bapat > <ashutosh.ba...@enterprisedb.com> wrote: >> On Thu, Jul 20, 2017 at 11:04 AM, Thomas Munro >> <thomas.mu...@enterprisedb.com> wrote: >>> On Fri, Jun 2, 2017 at 4:16 AM, Tom Lane <t...@sss.pgh.pa.us> wrote: >>>> I don't think it does really. The thing about a <> semijoin is that it >>>> will succeed unless *every* join key value from the inner query is equal >>>> to the outer key value (or is null). That's something we should consider >>>> to be of very low probability typically, so that the <> selectivity should >>>> be estimated as nearly 1.0. If the regular equality selectivity >>>> approaches 1.0, or when there are expected to be very few rows out of the >>>> inner query, then maybe the <> estimate should start to drop off from 1.0, >>>> but it surely doesn't move linearly with the equality selectivity. >>> >>> Ok, here I go like a bull in a china shop: please find attached a >>> draft patch. Is this getting warmer? >>> >>> In the comment for JOIN_SEMI I mentioned a couple of refinements I >>> thought of but my intuition was that we don't go for such sensitive >>> and discontinuous treatment of stats; so I made the simplifying >>> assumption that RHS always has more than 1 distinct value in it. >>> >>> Anti-join <> returns all the nulls from the LHS, and then it only >>> returns other LHS rows if there is exactly one distinct non-null value >>> in RHS and it happens to be that one. But if we make the same >>> assumption I described above, namely that there are always at least 2 >>> distinct values on the RHS, then the join selectivity is just >>> nullfrac. >>> >> >> The patch looks good to me. >> >> + /* >> + * For semi-joins, if there is more than one distinct key in the RHS >> + * relation then every non-null LHS row must find a match since it >> can >> + * only be equal to one of them. >> The word "match" confusing. Google's dictionary entry gives "be equal >> to (something) in quality or strength." as its meaning. May be we want >> to reword it as "... LHS row must find a joining row in RHS ..."? > > Thanks! Yeah, here's a version with better comments.
Thanks. Your version is better than mine. > > Does anyone know how to test a situation where the join is reversed according > to > get_join_variables, or "complicated cases where we can't tell for sure"? > explain select * from pg_class c right join pg_type t on (c.reltype = t.oid); would end up with *join_is_reversed = true; Is that what you want? For a semi-join however I don't know how to induce that. AFAIU, in a semi-join there is only one direction in which join can be specified. I didn't get the part about "complicated cases where we can't tell for sure". -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers