Re: Built-in CTYPE provider

Noah Misch Sat, 29 Jun 2024 15:09:20 -0700

On Wed, Mar 20, 2024 at 05:13:26PM -0700, Jeff Davis wrote:
> On Tue, 2024-03-19 at 13:41 +0100, Peter Eisentraut wrote:
> > * v25-0002-Support-C.UTF-8-locale-in-the-new-builtin-collat.patch
> > 
> > Looks ok.
> 
> Committed.


>       <varlistentry>
> +      <term><literal>pg_c_utf8</literal></term>
> +      <listitem>
> +       <para>
> +        This collation sorts by Unicode code point values rather than natural
> +        language order.  For the functions <function>lower</function>,
> +        <function>initcap</function>, and <function>upper</function>, it uses
> +        Unicode simple case mapping.  For pattern matching (including regular
> +        expressions), it uses the POSIX Compatible variant of Unicode <ulink
> +        url="https://www.unicode.org/reports/tr18/#Compatibility_Properties">Compatibility
> +        Properties</ulink>.  Behavior is efficient and stable within a
> +        <productname>Postgres</productname> major version.  This collation is
> +        only available for encoding <literal>UTF8</literal>.
> +       </para>
> +      </listitem>
> +     </varlistentry>

lower(), initcap(), upper(), and regexp_matches() are PROVOLATILE_IMMUTABLE.
Until now, we've delegated that responsibility to the user.  The user is
supposed to somehow never update libc or ICU in a way that changes outcomes
from these functions.  Now that postgresql.org is taking that responsibility
for builtin C.UTF-8, how should we govern it?  I think the above text and [1]
convey that we'll update the Unicode data between major versions, making
functions like lower() effectively STABLE.  Is that right?

(This thread had some discussion[2] that datcollversion/collversion won't
necessarily change when a major version changes lower() behavior.)
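
To make the concern concrete, here is a minimal sketch (in Python, not
PostgreSQL code) of what happens to an expression index on lower() if the
case-mapping data changes across a major version.  The mapping tables, the
sample rows, and the helper names make_lower and index_lookup are all
hypothetical, and the dict only stands in for an index whose keys were
computed under the old version.

```python
def make_lower(table):
    """Return a lower()-like function driven by a per-character mapping table."""
    def lower(s):
        return "".join(table.get(ch, ch) for ch in s)
    return lower

# Hypothetical mapping tables.  The extra entry in TABLE_NEW stands in for a
# case mapping that only appears in the newer major version's Unicode data.
TABLE_OLD = {"A": "a", "B": "b"}
TABLE_NEW = {"A": "a", "B": "b", "Ω": "ω"}

lower_old = make_lower(TABLE_OLD)
lower_new = make_lower(TABLE_NEW)

rows = ["AB", "ΩB"]

# Stand-in for an expression index on lower(col), built under the old version:
# the stored keys were computed once and are never recomputed on upgrade.
index = {lower_old(r): r for r in rows}

def index_lookup(key_fn, value):
    """Probe the stored index using whichever lower() the server has now."""
    return index.get(key_fn(value))

# Same stored index, same probe value, different function behavior.
found_before_upgrade = index_lookup(lower_old, "ΩB")
found_after_upgrade = index_lookup(lower_new, "ΩB")
```

In this sketch the row is found before the upgrade and missed afterwards,
even though the stored data never changed.  That is the kind of outcome the
IMMUTABLE marking is supposed to rule out.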

[1] https://postgr.es/m/7089acb3ebac0c1682a79c8bc16803cf06896fb9.ca...@j-davis.com
[2] https://postgr.es/m/5a1ecc40539f36cac5b27a62739a45a49785ca54.ca...@j-davis.com
