On Tue, Jun 20, 2023, at 14:10, Tomas Vondra wrote:
> On 6/20/23 12:59, Joel Jacobson wrote:
>> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
>>> select hashset_contains('{1,2}'::int4hashset,NULL::int);
>>> should return null?
>>
>> I agree, it should.
>>
>> I've now changed all functions except int4hashset() (the init function)
>> and the aggregate functions to be STRICT.
>
> I don't think this is correct / consistent with what we do elsewhere.
> IMHO it's perfectly fine to have a hashset containing a NULL value,

The reference to consistency with what we do elsewhere might not be entirely
applicable in this context, since the set feature we're designing is a new
beast in the SQL landscape.

I think adhering to the theoretical purity of sets by excluding NULLs aligns
us with set theory, simplifies our code, and parallels set implementations in
other languages.

I think we have an opportunity here to innovate and potentially influence a
future set concept in the SQL standard.

However, I see how one could argue against this reasoning, on the basis that
PostgreSQL users might be more familiar with, and expect, that NULLs can exist
in any data structure.

A different perspective is to look at what use-cases we can foresee.

I've been trying hard, but I can't find compelling use-cases where a NULL
element in a set would offer a more natural SQL query than handling NULLs
within SQL and keeping the set NULL-free.

Does anyone else have a strong, realistic example where including NULLs in the
set would simplify the SQL query?

/Joel