On Fri, Mar 17, 2017 at 1:54 AM, Thomas Munro <thomas.mu...@enterprisedb.com> wrote: > SELECT * > FROM lineitem l1 > WHERE EXISTS (SELECT * > FROM lineitem l2 > WHERE l1.l_orderkey = l2.l_orderkey); > > -> estimates 59986012 rows, actual rows 59,986,052 (scale 10 TPCH) > > SELECT * > FROM lineitem l1 > WHERE EXISTS (SELECT * > FROM lineitem l2 > WHERE l1.l_orderkey = l2.l_orderkey > AND l1.l_suppkey <> l2.l_suppkey); > > -> estimates 1 row, actual rows 57,842,090 (scale 10 TPCH)
The relevant code is in neqsel(). It estimates the fraction of rows that will be equal, and then does 1 - that number. Evidently, the query planner thinks that l1.l_suppkey = l2.l_suppkey would almost always be true, and therefore l1.l_suppkey <> l2.l_suppkey will almost always be false. I think the presumed selectivity of l1.l_suppkey = l2.l_suppkey is being computed by var_eq_non_const(), but I'm a little puzzled by that function is managing to produce a selectivity estimate of, essentially, 1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers