On Sun, 3 Mar 2019 at 05:25, Tom Lane <t...@sss.pgh.pa.us> wrote: > Hmm ... thinking about the strictness angle some more: what we really > need to optimize NOT IN, IIUC, is an assumption that the join operator > will never return NULL. While not having NULL inputs is certainly a > *necessary* condition for that (assuming a strict operator) it's not a > *sufficient* condition. Any Postgres function/operator is capable > of returning NULL whenever it feels like it. So checking strictness > does not lead to a mathematically correct optimization.
That's something I didn't think of. > My initial thought about plugging that admittedly-academic point is > to insist that the join operator be both strict and a member of a > btree opclass (hash might be OK too; less sure about other index types). > The system already contains assumptions that btree comparators never > return NULL. I doubt that this costs us any real-world applicability, > because if the join operator can neither merge nor hash, we're screwed > anyway for finding a join plan that's better than nested-loop. Why strict? If both inputs are non-NULL, then what additional guarantees does strict give us? I implemented a btree opfamily check in my version of the patch. Not so sure about hash, can you point me in the direction of a mention of how this is guarantees for btree? The attached v1.2 does this adds a regression test using the LINE type. This has an operator named '=', but no btree opfamily. A few other types are in this boat too, per: select typname from pg_type t where not exists(select 1 from pg_amop where amoplefttype = t.oid and amopmethod=403) and exists (select 1 from pg_operator where oprleft = t.oid and oprname = '='); The list of builtin types that have a hash opfamily but no btree opfamily that support NOT IN are not very exciting, so doing the same for hash might not be worth the extra code. select typname from pg_type t where exists(select 1 from pg_amop where amoplefttype = t.oid and amopmethod=405) and exists (select 1 from pg_operator where oprleft = t.oid and oprname = '=') and not exists(select 1 from pg_amop where amoplefttype = t.oid and amopmethod=403); typname --------- xid cid aclitem (3 rows) -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
not_in_anti_join_v1.2.patch
Description: Binary data