Re: [HACKERS] Can ICU be used for a database's default sort order?

Daniel Verite Wed, 12 Dec 2018 06:58:36 -0800

Peter Eisentraut wrote:

> Another issue is that we'd need to carefully divide up the role of the
> "default" collation and the "default" provider.  The default collation
> is the collation defined for the database, the default provider means to
> use the libc non-locale_t enabled API functions.  Right now these are
> always the same, but if the database-global locale is ICU, then the
> default collation would use the ICU provider.


I think one related issue that the patch works around by using a libc locale
as a proxy is knowing what to put into libc's LC_CTYPE and LC_COLLATE.
In fact I've been wondering if that's the main reason for the interface
implemented by the patch.

Otherwise, how should these libc locale categories be initialized for ICU
databases? For instance, in the existing FTS code, lowerstr_with_len() in
tsearch/ts_locale.c calls tolower() or towlower() to fold a string to
lower case when normalizing lexemes. This requires LC_CTYPE to be set
to something compatible with the database encoding, at the very
least. Even if that code looks like it might need to be changed for
ICU anyway (or just to be collation-aware according to the TODO marks?),
what about comparable calls in extensions?
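
To make the dependency concrete, here is a minimal sketch of the kind of
libc call an extension might make. It is not PostgreSQL code: the helper
names fold_lower_libc() and fold_lower_with_ctype() are hypothetical, and
the sketch assumes a single-byte encoding. The point is only that tolower()
gives results that depend on whatever LC_CTYPE is in effect in the process.

```c
#include <ctype.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from the PostgreSQL tree): fold a buffer in a
 * single-byte encoding to lower case in place. tolower() consults the
 * current LC_CTYPE, so the result depends on how the process set it.
 */
static void
fold_lower_libc(char *s, size_t len)
{
    for (size_t i = 0; i < len; i++)
        s[i] = (char) tolower((unsigned char) s[i]);
}

/*
 * Hypothetical helper: temporarily switch LC_CTYPE to "ctype", fold the
 * buffer, then restore the previous setting. Returns 0 on success, -1 if
 * the locale could not be queried or set.
 */
static int
fold_lower_with_ctype(const char *ctype, char *s, size_t len)
{
    char        saved[256];
    const char *cur = setlocale(LC_CTYPE, NULL);   /* query only */

    if (cur == NULL || strlen(cur) >= sizeof(saved))
        return -1;
    strcpy(saved, cur);         /* the returned string may be overwritten */

    if (setlocale(LC_CTYPE, ctype) == NULL)
        return -1;              /* locale not available on this system */

    fold_lower_libc(s, len);

    setlocale(LC_CTYPE, saved); /* restore the caller's LC_CTYPE */
    return 0;
}
```

Under LC_CTYPE "C", only ASCII letters are folded, so a LATIN1 byte such as
0xC9 (É) is left unchanged. Under a LATIN1 locale, if one is installed, the
same byte would typically be folded, which is why the value these categories
hold in a backend matters to code like this.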

If we don't touch libc's LC_COLLATE/LC_CTYPE in backends, would extension
code then inherit them from the postmaster? Would that be acceptable?
If not, maybe ICU databases should have these as settable options,
in addition to their ICU locale?


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
