On Mon, Jun 19, 2023, at 11:21, Tomas Vondra wrote:
> AFAICS the standard only defines arrays and multisets. Arrays are pretty
> much the thing we have, including the ARRAY[] constructor etc. Multisets
> are similar to hashset discussed here, except that it tracks the number
> of elements for each value (which would be trivial in hashset).
>
> So if we want to make this a built-in feature, maybe we should aim to do
> the multiset thing, with the standard SQL syntax? Extending the grammar
> should not be hard, I think. I'm not sure of the underlying code
> (ArrayType, ARRAY_SUBLINK stuff, etc.) we could reuse or if we'd need a
> lot of separate code doing that.

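To make the set/multiset distinction concrete, here is a minimal sketch in Python. Python's built-in set and collections.Counter are only stand-ins for a hashset and a multiset; they are assumptions for illustration, not the proposed PostgreSQL types or the SQL standard's MULTISET.

```python
from collections import Counter

values = [1, 2, 2, 3, 3, 3]

# A set keeps each distinct value once; duplicates collapse.
as_set = set(values)

# A multiset also tracks how many times each value occurs.
# Counter is used here as a stand-in for a multiset.
as_multiset = Counter(values)

set_size_before = len(as_set)
multiset_size_before = sum(as_multiset.values())

# Adding a value that is already present:
as_set.add(2)          # no effect on a set
as_multiset[2] += 1    # increments the count in a multiset

set_size_after = len(as_set)
multiset_size_after = sum(as_multiset.values())

print(sorted(as_set))                  # distinct values only
print(sorted(as_multiset.elements()))  # values repeated per their count
print(set_size_before, set_size_after)
print(multiset_size_before, multiset_size_after)
```

The same insert leaves the set unchanged but grows the multiset, which is the kind of difference that can surprise a caller who expected set semantics.
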
Multisets treat duplicates differently from sets: every value carries a count, and that may cause unexpected behaviour for users who only want set semantics. Sets and multisets have distinct uses in C++, Rust, Java, etc. However, sets are more fundamental and more prevalent in standard libraries than multisets.

Even though SQL allows for multisets, I would prefer a distinct hashset type, since it helps users pick the appropriate data structure and reduces misuse. Beyond standards compliance, the need for multisets is unclear to me.

/Joel