On 10.01.22 12:49, Daniel Verite wrote:
> With the current patch, it's not possible, AFAICS, because the user can't tell that the collation is non-deterministic. Presumably this would require another option to CREATE DATABASE and another column to store that bit of information.
Adding this would be easy, but since pattern matching currently does not support nondeterministic collations, making a global collation nondeterministic would break a lot of system views, psql queries, pg_dump queries, etc., so it's not practical. I view this as an orthogonal project. Once we can support this without breaking system views etc., it will be easy to enable with a new column in pg_database.
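To illustrate the pattern-matching limitation with an ordinary (non-global) collation, here is a minimal sketch. It assumes an ICU-enabled build; the collation name case_insensitive is just an illustrative choice, and the locale string is the usual case-insensitive example.

```sql
-- Illustrative collation name; requires a server built with ICU support.
CREATE COLLATION case_insensitive (
    provider = icu,
    locale = 'und-u-ks-level2',
    deterministic = false
);

-- Equality comparison works and ignores case under this collation.
SELECT 'abc' = 'ABC' COLLATE case_insensitive;

-- Pattern matching is rejected with an error on servers where LIKE
-- does not support nondeterministic collations.
SELECT 'abc' LIKE 'ABC' COLLATE case_insensitive;
```

If the database default collation behaved like this, every catalog query that uses LIKE or a regular expression on a text column would hit the same error.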
> The "daticucol" column also suggests that we don't expect to add other collation providers in the future. Maybe a pair of columns like (datcollprovider, datcolllocale) would be more future-proof, or a (datcollprovider, datcolllocale, datcollisdeterministic) triplet if non-deterministic collations are allowed.
I don't expect many new collation providers. So I don't think an EAV-like storage would be helpful. The other problem is that we don't know what we need. For example, the libc provider needs both a collate and a ctype value, so that wouldn't fit into that scheme nicely.
> Also, pg_collation has "collversion" to detect a mismatch between the ICU runtime and existing indexes. I don't see that field for the db-wide ICU collation, so maybe we currently miss the capability to detect the mismatch for the db-wide collation?
Yeah, I think I need to add a datcollversion field and the associated checks.
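For reference, the existing per-collation check mentioned above can be done by hand roughly as follows. This is only a sketch of the comparison that pg_collation.collversion already allows; a database-wide check against a datcollversion column would presumably be analogous, but that column does not exist in the current patch.

```sql
-- List ICU collations whose recorded version differs from what the
-- ICU library currently installed reports.
SELECT collname,
       collversion AS recorded_version,
       pg_collation_actual_version(oid) AS runtime_version
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);
```

Any rows returned point at collations whose dependent indexes may need to be rebuilt.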