On Wed, Feb 26, 2025 at 4:46 AM Richard Guo <guofengli...@gmail.com> wrote: > I agree that it'd be beneficial to make some improvements to NOT IN > subqueries. From what I can see, we may have two potential options: > > * As Tom mentioned, we can prove that the subquery's output never > contains NULL values and then convert the NOT IN into an anti-join. > (It seems to me that we would also need to prove that the outer side > never contains NULL values either, because whether the NULL values > from the outer side should be included in the output depends on > whether the inner side is empty.) > > * We can add support in the executor to handle the NULL semantics of > NOT IN. This may require inventing a new join type. > > I'm not quite sure which option is more promising at the moment, or if > there are other options to consider.
I'm not quite sure, either. I think that the first option (proving that there can be no NULL values) is probably more similar to thingswe've done elsewhere in the planner, so I think I have been assuming that if we did something about this, it would be that. However, I have also had the idea of extending the executor to handle this situation, and it's possible that option is more promising. I don't really know. I think it's pretty common to have NOT IN clauses where one can be certain that there definitely isn't a NULL present, but unfortunately I think it's also pretty common to have cases where a NULL could be present, or at least, where one cannot provide that no NULL can be present. -- Robert Haas EDB: http://www.enterprisedb.com