On Mon, Jun 19, 2023 at 2:51 PM Joel Jacobson <j...@compiler.org> wrote:
>
> On Mon, Jun 19, 2023, at 02:00, jian he wrote:
> > select hashset_contains('{1,2}'::int4hashset,NULL::int);
> > should return null?
>
> Hmm, that's a good philosophical question.
>
> I notice Tomas Vondra in the initial commit opted for allowing NULL inputs,
> treating them as empty sets, e.g. in int4hashset_add() we create a
> new hashset if the first argument is NULL.
>
> I guess the easiest perhaps most consistent NULL-handling strategy
> would be to just mark all relevant functions STRICT except for the agg ones
> since we probably want to allow skipping over rows with NULL values
> without the entire result becoming NULL.
>
> But if we're not just going the STRICT route, then I think it's a bit more tricky,
> since you could argue the hashset_contains() example should return FALSE
> since the set doesn't contain the NULL value, but OTOH, since we don't
> store NULL values, we don't know if has ever been added, hence a NULL
> result would perhaps make more sense.
>
> I think I lean on thinking that if we want to be "NULL-friendly", like we
> currently are in hashset_add(), it would probably be most user-friendly
> to be consistent and let all functions return non-null return values in
> all cases where it is not unreasonable.
>
> Since we're essentially designing a set-theoretic system, I think we should
> aim for the logical "soundness" property of it and think about how we can
> verify that it is.
>
> Thoughts?
>
> /Joel

Should the hashset_to_array function be STRICT?

I noticed that hashset_symmetric_difference and hashset_difference handle
NULL differently. It seems they should handle NULL in a consistent way.

select '{1,2,NULL}'::int[] operator (pg_catalog.@>) '{NULL}'::int[]; --false
select '{1,2,NULL}'::int[] operator (pg_catalog.&&) '{NULL}'::int[]; --false

So, similarly, I guess hashset_contains should return false for:

select hashset_contains('{1,2}'::int4hashset,NULL::int);
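
To make the two options concrete, here is a minimal sketch that uses plain int[] instead of int4hashset. The function names demo_contains_strict and demo_contains_nullfriendly are hypothetical, made up for illustration only, and are not part of the hashset extension. It only shows how STRICT and a "NULL-friendly" body differ on the same NULL input.

```sql
-- Hypothetical helpers over int[]; NOT part of the hashset extension.

-- STRICT: PostgreSQL does not run the body and returns NULL
-- whenever any argument is NULL.
CREATE FUNCTION demo_contains_strict(s int[], v int) RETURNS boolean
LANGUAGE sql IMMUTABLE STRICT
AS $$ SELECT v = ANY (s) $$;

-- Non-STRICT: the body runs on NULL input and maps an unknown
-- result to false, so the function never returns NULL.
CREATE FUNCTION demo_contains_nullfriendly(s int[], v int) RETURNS boolean
LANGUAGE sql IMMUTABLE
AS $$ SELECT COALESCE(v = ANY (s), false) $$;

SELECT demo_contains_strict('{1,2}'::int[], NULL::int);        -- NULL
SELECT demo_contains_nullfriendly('{1,2}'::int[], NULL::int);  -- false
SELECT demo_contains_nullfriendly('{1,2}'::int[], 2);          -- true
```

The second variant matches the --false results of the array operators above, while the first is what marking the function STRICT would give.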