On Fri, 2023-06-16 at 16:50 +0200, Peter Eisentraut wrote:
> This looks good to me.
>
> Attached is small fixup patch with some documentation tweaks and
> simplifying some test code (also includes pgperltidy).
Thank you. Committed with your fixups.
Regards,
Jeff Davis
On 14.06.23 23:24, Jeff Davis wrote:
On Mon, 2023-06-12 at 23:04 +0200, Peter Eisentraut wrote:
Patch 0003:
Makes LOCALE apply to all providers. The overall feel after this patch
is that "locale" now means the collation locale, and
LC_COLLATE/LC_CTYPE are for the server environment. When using
On Mon, 2023-06-12 at 23:04 +0200, Peter Eisentraut wrote:
> I object to adding a new provider for PG16 (patch 0001).
Added to July CF for 17.
> > 2. Patch 0004 is possibly out of scope for 16
> Also clearly a new feature.
Added to July CF for 17.
Regards,
Jeff Davis
On 09.06.23 02:36, Jeff Davis wrote:
Patches 0001, 0002:
These patches implement the built-in provider and automatically change
provider=icu to provider=builtin when the locale is C.
I object to adding a new provider for PG16 (patch 0001). This is
clearly a new feature, which wasn't even con
Jeff Davis wrote:
> I guess where I'm confused is: why would a user actually want their
> database collation to be C.UTF-8? It's slower than C, our
> implementation doesn't properly version it (as you pointed out), and
> the semantics don't seem great ('Z' < 'a').
Because when LC_CTYPE=C,
On Fri, 2023-06-09 at 14:12 +0200, Daniel Verite wrote:
> > I implemented a compromise where initdb will
> > change C.UTF-8 to the built-in provider
>
> $ initdb --locale=C.UTF-8
...
> This setup is not what the user has asked for and leads to that kind of
> wrong results:
>
> $ psql -c "s
Jeff Davis wrote:
> I implemented a compromise where initdb will
> change C.UTF-8 to the built-in provider
This handling of C.UTF-8 would be felt by users as simply broken.
With the v10 patches:
$ initdb --locale=C.UTF-8
initdb: using locale provider "builtin" for ICU locale "C.UTF-
On 6/8/23 17:15, Jeff Davis wrote:
On Wed, 2023-06-07 at 20:52 -0400, Joe Conway wrote:
If the provider has no such thing, throw an error.
Just to be clear, that implies that users (and buildfarm members) with
LANG=C.UTF-8 in their environment would not be able to run a plain
"initdb -D data";
On Wed, 2023-06-07 at 20:52 -0400, Joe Conway wrote:
> If the provider has no such thing, throw an error.
Just to be clear, that implies that users (and buildfarm members) with
LANG=C.UTF-8 in their environment would not be able to run a plain
"initdb -D data"; they'd get an error. It's hard for m
Jeff Davis wrote:
> As I replied in that subthread, that creates a worse problem: if you
> only change the provider when the locale is C, then what about when the
> locale is *not* C?
>
> export LANG=en_US.UTF-8
> initdb -D data --locale=fr_FR.UTF-8
> ...
> provider:icu
> ICU
Tatsuo Ishii wrote:
> >> Yes it's a special case but when doing initdb --locale=C, a user does
> >> not need or want an ICU locale. They want the same thing as what v15
> >> does with the same arguments: a template0 database with
> >> datlocprovider='c', datcollate='C', datctype='C', dat
>> As I replied in that subthread, that creates a worse problem: if you
>> only change the provider when the locale is C, then what about when the
>> locale is *not* C?
>>
>> export LANG=en_US.UTF-8
>> initdb -D data --locale=fr_FR.UTF-8
>> ...
>> provider:icu
>> ICU locale: en-
On 6/7/23 19:26, Jeff Davis wrote:
* What do we do in the case where the environment has LANG=C.UTF-8 (as
some buildfarm members do)? Is an error acceptable in that case?
If I understand the discussion so far correctly, I think that case
should fall to the provider.
If it supports "C.UTF-8"
Hi,
> On Wed, 2023-06-07 at 23:50 +0200, Daniel Verite wrote:
>> The simplest way to obtain that in v16 is to teach initdb that
>> --locale=C without the --locale-provider option implies that
>> --locale-provider=libc ([1])
>
> As I replied in that subthread, that creates a worse problem: if you
On Thu, 2023-06-08 at 00:11 +0200, Peter Eisentraut wrote:
> On 05.06.23 19:54, Jeff Davis wrote:
> > New patch series attached.
>
> Could you clarify what here is intended for 16 and what is for later?
I apologize about the patch churn here. I implemented several
approaches to see what feedback
On Wed, 2023-06-07 at 23:50 +0200, Daniel Verite wrote:
> The simplest way to obtain that in v16 is to teach initdb that
> --locale=C without the --locale-provider option implies that
> --locale-provider=libc ([1])
As I replied in that subthread, that creates a worse problem: if you
only change th
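
The initdb rule proposed above (a bare --locale=C implies --locale-provider=libc) can be sketched roughly as follows. This is a minimal illustration, not initdb's actual code: the function name `choose_provider`, its parameters, and the "icu" default are assumptions made for the example.

```python
from typing import Optional

# Locale names the thread treats as "non-locales".
C_LIKE_LOCALES = {"C", "POSIX"}


def choose_provider(locale: Optional[str],
                    provider: Optional[str],
                    default_provider: str = "icu") -> str:
    """Hypothetical model of the proposed rule, not real initdb logic."""
    # An explicit --locale-provider always wins.
    if provider is not None:
        return provider
    # Proposed special case: C/POSIX without a provider implies libc.
    if locale in C_LIKE_LOCALES:
        return "libc"
    # Otherwise fall back to the build default (assumed to be ICU here).
    return default_provider


print(choose_provider("C", None))            # special case applies
print(choose_provider("fr_FR.UTF-8", None))  # default provider is kept
```

The second call is the case raised in the reply: the special case only covers C, so a non-C --locale still leaves the default provider in charge.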
On 05.06.23 19:54, Jeff Davis wrote:
New patch series attached.
Could you clarify what here is intended for 16 and what is for later?
This patch set keeps expanding and changing in each iteration.
There is a PG16 open item linked to this thread
* The rules for choosing default ICU locale se
On 05.06.23 19:54, Jeff Davis wrote:
New patch series attached. I plan to commit 0001 and 0002 soon, unless
there are objections.
0001 causes the "C" and "POSIX" locales to be treated with
memcmp/pg_ascii semantics in ICU, just like in libc. We also considered
a new "none" provider, but it's mo
Jeff Davis wrote:
> The locale "C" is a special case, documented as a non-locale. So, if
> LOCALE/--locale apply to ICU, then either ICU needs to handle locale
> "C" in the expected way (v8 patch series); or when we see locale "C" we
> need to somehow change the provider into something tha
On 22.05.23 19:35, Jeff Davis wrote:
On Thu, 2023-05-11 at 13:07 +0200, Peter Eisentraut wrote:
Here is my proposed patch for this.
The commit message makes it sound like lc_collate/ctype are completely
obsolete, and I don't think that's quite right: they still represent
the server environment
> "Joe" == Joe Conway writes:
> On 6/6/23 15:55, Tom Lane wrote:
>> Robert Haas writes:
>>> On Tue, Jun 6, 2023 at 3:25 PM Tom Lane wrote:
Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
"Jonathan S. Katz" writes:
> Since we're bikeshedding, "postgresql" or "builtin" could make it seem
> to a (app) developer that these may be recommended options, as we're
> trusting PostgreSQL to make the best choices for us. Granted, v16 is
> (theoretically) defaulting to ICU, so that choice i
On 6/6/23 3:56 PM, Joe Conway wrote:
On 6/6/23 15:55, Tom Lane wrote:
Robert Haas writes:
On Tue, Jun 6, 2023 at 3:25 PM Tom Lane wrote:
Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's su
On 6/6/23 15:55, Tom Lane wrote:
Robert Haas writes:
On Tue, Jun 6, 2023 at 3:25 PM Tom Lane wrote:
Also +1, except that I find "none" a rather confusing choice of name.
There *is* a provider, it's just PG itself not either libc or ICU.
I thought Joe's suggestion of "internal" made more sense
Robert Haas writes:
> On Tue, Jun 6, 2023 at 3:25 PM Tom Lane wrote:
>> Also +1, except that I find "none" a rather confusing choice of name.
>> There *is* a provider, it's just PG itself not either libc or ICU.
>> I thought Joe's suggestion of "internal" made more sense.
> Or perhaps "builtin"
On Tue, Jun 6, 2023 at 3:25 PM Tom Lane wrote:
> Joe Conway writes:
> > On 6/6/23 15:18, Jeff Davis wrote:
> >> The locale "C" is a special case, documented as a non-locale. So, if
> >> LOCALE/--locale apply to ICU, then either ICU needs to handle locale
> >> "C" in the expected way (v8 patch ser
Joe Conway writes:
> On 6/6/23 15:18, Jeff Davis wrote:
>> The locale "C" is a special case, documented as a non-locale. So, if
>> LOCALE/--locale apply to ICU, then either ICU needs to handle locale
>> "C" in the expected way (v8 patch series); or when we see locale "C" we
>> need to somehow chan
On 6/6/23 15:18, Jeff Davis wrote:
On Tue, 2023-06-06 at 15:09 +0200, Daniel Verite wrote:
FWIW I don't quite see how 0001 improve things or what problem it's
trying to solve.
The word "locale" is generic, so we need to make LOCALE/--locale apply
to whatever provider is being used. If "locale"
On 6/6/23 15:15, Jeff Davis wrote:
On Tue, 2023-06-06 at 14:11 -0400, Joe Conway wrote:
This discussion makes me wonder (though probably too late for the v16
cycle) if we shouldn't treat "C" and "POSIX" locales to be a third
provider, something like "internal".
That's exactly what I did in v6
On Tue, 2023-06-06 at 14:11 -0400, Joe Conway wrote:
> This discussion makes me wonder (though probably too late for the v16
> cycle) if we shouldn't treat "C" and "POSIX" locales to be a third
> provider, something like "internal".
That's exactly what I did in v6 of this series: I created a "non
On 6/6/23 09:09, Daniel Verite wrote:
Jeff Davis wrote:
New patch series attached. I plan to commit 0001 and 0002 soon, unless
there are objections.
0001 causes the "C" and "POSIX" locales to be treated with
memcmp/pg_ascii semantics in ICU, just like in libc. We also
considered a new "
Jeff Davis wrote:
> New patch series attached. I plan to commit 0001 and 0002 soon, unless
> there are objections.
>
> 0001 causes the "C" and "POSIX" locales to be treated with
> memcmp/pg_ascii semantics in ICU, just like in libc. We also
> considered a new "none" provider, but it's mor
Jeff Davis wrote:
> > #1
> >
> > postgres=# create database test1 locale='fr_FR.UTF-8';
> > NOTICE: using standard form "fr-FR" for ICU locale "fr_FR.UTF-8"
> > ERROR: new ICU locale (fr-FR) is incompatible with the ICU locale of
>
> I don't see a problem here. If you specify LOCALE to
On Mon, 2023-05-22 at 14:34 +0200, Peter Eisentraut wrote:
> Please put blank lines between
>
>
>
>
> etc., matching existing style.
>
> We usually don't capitalize the collation parameters like
>
> CREATE COLLATION mycollation1 (PROVIDER = icu, LOCALE = 'ja-JP');
>
> elsewhere in the documen
On 5/24/23 11:39, Jeff Davis wrote:
On Mon, 2023-05-22 at 22:09 +0200, Daniel Verite wrote:
In practice we're probably getting the "und" ICU locale whereas "fr"
would be appropriate.
This is a good point and illustrates that ICU is not a drop-in
replacement for libc in all cases.
I don't see
On Mon, 2023-05-22 at 22:09 +0200, Daniel Verite wrote:
> While I agree that the LOCALE option in CREATE DATABASE is
> counter-intuitive,
I think it's more than that. As Andrew Gierth pointed out:
$ initdb --locale=fr_FR
...
ICU locale: en-US
...
Is more than just counter-intuiti
Jeff Davis wrote:
> If we special case locale=C, but do nothing for locale=fr_FR, then I'm
> not sure we've solved the problem. Andrew Gierth raised the issue here,
> which he called "maximally confusing":
>
> https://postgr.es/m/874jp9f5jo@news-spur.riddles.org.uk
>
> That's why I f
On Mon, 2023-05-22 at 14:27 +0200, Peter Eisentraut wrote:
> The rules are for setting whatever sort order you like. Maybe you want
> to sort + before - or whatever. It's like, if you don't like it, build
> your own.
A build-your-own feature is fine, but it's not completely zero cost.
The
On Thu, 2023-05-11 at 13:07 +0200, Peter Eisentraut wrote:
> Here is my proposed patch for this.
The commit message makes it sound like lc_collate/ctype are completely
obsolete, and I don't think that's quite right: they still represent
the server environment, which does still matter in some cases
On Thu, 2023-05-11 at 13:09 +0200, Peter Eisentraut wrote:
> There is also the deterministic flag and the icurules setting.
> Depending on what level of detail you imagine the user needs, you really
> do need to look at the whole picture, not some subset of it.
(Nit: all database default colla
On 18.05.23 00:59, Jeff Davis wrote:
On Tue, 2023-05-16 at 20:23 -0700, Jeff Davis wrote:
Other than that, and I took your suggestions almost verbatim. Patch
attached. Thank you!
Attached new patch with a typo fix and a few other edits. I plan to
commit soon.
Some small follow-up on this pat
On 18.05.23 19:55, Jeff Davis wrote:
On Wed, 2023-05-17 at 19:59 -0400, Jonathan S. Katz wrote:
I did a quicker read through this time. LGTM overall. I like what you
did with the explanations around sensitivity (now it makes sense).
Committed, thank you.
There are a few things I don't underst
On Fri, 2023-05-19 at 21:13 +0200, Daniel Verite wrote:
> ISTM that if we want to go that route, we need to make the minimum
> changes at the user interface level and not any deeper, so that when
> (locale="C" OR locale="POSIX") AND the provider has not been specified,
> then the command (initdb
Jeff Davis writes:
> Committed, thank you.
This commit has given the PDF docs build some indigestion:
Making portrait pages on A4 paper (210mmx297mm)
/home/postgres/bin/fop -fo postgres-A4.fo -pdf postgres-A4.pdf
[WARN] FOUserAgent - Font "Symbol,normal,700" not found. Substituting with
"Symbol
Jeff Davis wrote:
> 2) Automatically change the provider to libc when locale=C.
>
> Almost works, but it's not clear how we handle the case "provider=icu
> lc_collate='fr_FR.utf8' locale=C".
>
> If we change it to "provider=libc lc_collate=C", we've overridden the
> specified lc_collate.
On Thu, 2023-05-18 at 20:11 +0200, Matthias van de Meent wrote:
> As I complain about in [0], since 5cd1a5af --no-locale has been broken
> / behaving outside its description: Instead of being equivalent to
> `--locale=C` it now also overrides `--locale-provider=libc`, resulting
> in undocument
On Thu, 2023-05-18 at 13:58 -0400, Jonathan S. Katz wrote:
> From my read of them, as an app developer I'd be very unlikely to
> use
> this. Maybe there is something with building out some collation rules
> vis-a-vis an extension, but I have trouble imagining the use-case. I
> may
> also not be
On Fri, 21 Apr 2023 at 22:46, Jeff Davis wrote:
>
> On Fri, 2023-04-21 at 19:00 +0100, Andrew Gierth wrote:
> > Also, somewhere along the line someone broke initdb --no-locale, which
> > should result in C locale being the default everywhere, but when I just
> > tested it it pi
On 5/18/23 1:55 PM, Jeff Davis wrote:
On Wed, 2023-05-17 at 19:59 -0400, Jonathan S. Katz wrote:
I did a quicker read through this time. LGTM overall. I like what you
did with the explanations around sensitivity (now it makes sense).
Committed, thank you.
\o/
There are a few things I don't
On Wed, 2023-05-17 at 19:59 -0400, Jonathan S. Katz wrote:
> I did a quicker read through this time. LGTM overall. I like what you
> did with the explanations around sensitivity (now it makes sense).
Committed, thank you.
There are a few things I don't understand that would be good to
document be
On 5/17/23 6:59 PM, Jeff Davis wrote:
On Tue, 2023-05-16 at 20:23 -0700, Jeff Davis wrote:
Other than that, and I took your suggestions almost verbatim. Patch
attached. Thank you!
Attached new patch with a typo fix and a few other edits. I plan to
commit soon.
I did a quicker read through th
On Tue, 2023-05-16 at 20:23 -0700, Jeff Davis wrote:
> Other than that, and I took your suggestions almost verbatim. Patch
> attached. Thank you!
Attached new patch with a typo fix and a few other edits. I plan to
commit soon.
Regards,
Jeff Davis
From d0d2375fa55618b60f361f6bb64b2c494901
On Tue, 2023-05-16 at 15:35 -0400, Jonathan S. Katz wrote:
> + Sensitivity when determining equality, with
> + level1 the least sensitive and
> + identic the most sensitive. See <xref
> + linkend="icu-collation-levels"/> for details.
>
> This discusses equality sensiti
On 5/5/23 8:25 PM, Jeff Davis wrote:
On Fri, 2023-04-21 at 20:12 -0400, Robert Haas wrote:
On Fri, Apr 21, 2023 at 5:56 PM Jeff Davis wrote:
Most of the complaints seem to be complaints about v15 as well, and
while those complaints may be a reason to not make ICU the default,
they are also an
On Tue, 2023-05-16 at 19:00 +0300, Alexander Lakhin wrote:
> I'm not sure about the proposed change in icu_from_uchar(). It seems
> that
> len_result + 1 bytes should always be enough for the result string
> terminated
> with NUL. If that's not true (we want to protect from some ICU bug
> here),
>
Hi Jeff,
16.05.2023 00:03, Jeff Davis wrote:
On Sat, 2023-05-13 at 13:00 +0300, Alexander Lakhin wrote:
On the current master (after 455f948b0, and before f7faa9976, of
course)
I get an ASAN-detected failure with the following query:
CREATE COLLATION col (provider = icu, locale = '123456789012'
On Mon, 2023-05-08 at 14:59 -0700, Jeff Davis wrote:
> The easiest thing to do is revert it for now, and after we sort out
> the
> memcmp() path for the ICU provider, then I can commit it again (after
> that point it would just be code cleanup and should have no
> functional
> impact).
The convers
On Sat, 2023-05-13 at 13:00 +0300, Alexander Lakhin wrote:
> On the current master (after 455f948b0, and before f7faa9976, of
> course)
> I get an ASAN-detected failure with the following query:
> CREATE COLLATION col (provider = icu, locale = '123456789012');
>
Thank you for the report!
ICU sou
On 11.05.23 23:29, Jeff Davis wrote:
New patch series attached.
=== 0001: fix bug that allows creating hidden collations
Bug:
https://www.postgresql.org/message-id/051c9395cf880307865ee8b17acdbf7f838c1e39.ca...@j-davis.com
This is still being debated in the other thread. Not really related t
Hello Jeff,
09.05.2023 00:59, Jeff Davis wrote:
The easiest thing to do is revert it for now, and after we sort out the
memcmp() path for the ICU provider, then I can commit it again (after
that point it would just be code cleanup and should have no functional
impact).
On the current master (a
New patch series attached.
=== 0001: fix bug that allows creating hidden collations
Bug:
https://www.postgresql.org/message-id/051c9395cf880307865ee8b17acdbf7f838c1e39.ca...@j-davis.com
=== 0002: handle some kinds of libc-style locale strings
ICU used to handle libc locale strings like 'fr_FR@e
On 09.05.23 17:09, Jeff Davis wrote:
It's awkward for a user to read pg_database.datlocprovider, then
depending on that, either look in datcollate or daticulocale. (It's
awkward in the code, too.)
Maybe some built-in function that returns a tuple of the default
provider, the locale, and the vers
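
The lookup described above (read datlocprovider, then pick datcollate or daticulocale) could be wrapped in a single helper. The sketch below is only an illustration in Python over a plain dict standing in for a pg_database row; the helper `default_collation` and the `DefaultCollation` type are hypothetical and are not an existing PostgreSQL function.

```python
from typing import Mapping, NamedTuple, Optional


class DefaultCollation(NamedTuple):
    provider: str
    locale: Optional[str]
    version: Optional[str]


def default_collation(row: Mapping[str, Optional[str]]) -> DefaultCollation:
    """Hypothetical helper: collapse the provider-dependent columns."""
    code = row["datlocprovider"]
    version = row.get("datcollversion")
    if code == "i":
        # ICU: the collation locale lives in daticulocale.
        return DefaultCollation("icu", row.get("daticulocale"), version)
    if code == "c":
        # libc: the collation locale lives in datcollate.
        return DefaultCollation("libc", row.get("datcollate"), version)
    raise ValueError(f"unexpected datlocprovider: {code!r}")


# Made-up sample row, for illustration only.
row = {"datlocprovider": "i", "datcollate": "C",
       "daticulocale": "en-US", "datcollversion": None}
print(default_collation(row))
```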
On 09.05.23 10:25, Alvaro Herrera wrote:
On 2023-Apr-24, Peter Eisentraut wrote:
The GUC settings lc_collate and lc_ctype are from a time when those locale
settings were cluster-global. When we made those locale settings
per-database (PG 8.4), we kept them as read-only. As of PG 15, you can u
On Tue, 2023-05-09 at 10:25 +0200, Alvaro Herrera wrote:
> I agree with removing these in v16, since they are going to become
> more
> meaningless and confusing.
Agreed, but it would be nice to have an alternative that does the right
thing.
It's awkward for a user to read pg_database.datlocprovid
On 2023-Apr-24, Peter Eisentraut wrote:
> The GUC settings lc_collate and lc_ctype are from a time when those locale
> settings were cluster-global. When we made those locale settings
> per-database (PG 8.4), we kept them as read-only. As of PG 15, you can use
> ICU as the per-database locale pr
On Mon, 2023-05-08 at 17:47 -0400, Tom Lane wrote:
> -ERROR: could not convert locale name "C" to language tag:
> U_ILLEGAL_ARGUMENT_ERROR
> +NOTICE: using standard form "en-US-u-va-posix" for locale "C"
...
> I suppose this is environment-dependent. Sadly, the buildfarm
> client does not show
Jeff Davis writes:
> === 0001: do not convert C to en-US-u-va-posix
> I plan to commit this soon.
Several buildfarm animals have failed since this went in. The
only one showing enough info to diagnose is siskin [1]:
@@ -1043,16 +1043,15 @@
ERROR: ICU locale "nonsense-nowhere" has unknown lan
On Fri, 2023-04-21 at 20:12 -0400, Robert Haas wrote:
> On Fri, Apr 21, 2023 at 5:56 PM Jeff Davis wrote:
> > Most of the complaints seem to be complaints about v15 as well, and
> > while those complaints may be a reason to not make ICU the default,
> > they are also an argument that we should con
On Fri, 2023-04-28 at 14:35 -0700, Jeff Davis wrote:
> On Thu, 2023-04-27 at 14:23 +0200, Daniel Verite wrote:
> > This should be pg_strcasecmp(...) == 0
>
> Good catch, thank you! Fixed in updated patches.
Rebased patches.
=== 0001: do not convert C to en-US-u-va-posix
I plan to commit this so
On Thu, 2023-04-27 at 14:23 +0200, Daniel Verite wrote:
> This should be pg_strcasecmp(...) == 0
Good catch, thank you! Fixed in updated patches.
> postgres=# create database lat9 locale 'fr_FR@euro' encoding LATIN9
> template 'template0';
> ERROR: could not convert locale name "fr_FR@euro" to
Jeff Davis wrote:
> Attached are a few small patches:
>
> 0001: don't convert C to en-US-u-va-posix
> 0002: handle locale C the same regardless of the provider, as you
> suggest above
> 0003: make LOCALE (or --locale) apply to everything including ICU
Testing this briefly I noticed
On Fri, 2023-04-21 at 22:35 +0100, Andrew Gierth wrote:
> Can lc_collate_is_c() be taught to check whether an ICU locale is using
> POSIX collation?
Attached are a few small patches:
0001: don't convert C to en-US-u-va-posix
0002: handle locale C the same regardless of the provid
"Daniel Verite" writes:
> FTR the full text search parser still uses the libc functions
> is[w]space/alpha/digit... that depend on lc_ctype, whether the db
> collation provider is ICU or not.
Yeah, those aren't even connected up to the collation-selection
mechanisms; lots of work to do there. I
Jeff Davis wrote:
> > (I'm not sure whether those operations can get redirected to ICU
> > today
> > or whether they still always go to libc, but we'll surely want to fix
> > it eventually if the latter is still true.)
>
> Those operations do get redirected to ICU today.
FTR the full te
On Fri, 2023-04-21 at 16:00 -0400, Tom Lane wrote:
> I think I might like this idea, except for one thing: you're
> imagining
> that the locale doesn't control anything except string comparisons.
> What about to_upper/to_lower, character classifications in regexes,
> etc?
If provider='libc' and LC
On 22.04.23 01:00, Jeff Davis wrote:
On Fri, 2023-04-21 at 16:33 -0400, Robert Haas wrote:
And the fact that "C" or "POSIX" gets transformed into
"en-US-u-va-posix"
I already expressed, on reflection, that we should probably just not do
that. So I think we're in agreement on this point; patch
On 21.04.23 19:14, Peter Eisentraut wrote:
On 21.04.23 19:09, Sandro Santilli wrote:
On Fri, Apr 21, 2023 at 11:48:51AM -0400, Tom Lane wrote:
"Regina Obe" writes:
https://trac.osgeo.org/postgis/ticket/5375
If they actually are using locale C, I would say this is a bug.
That should designa
On Fri, Apr 21, 2023 at 5:56 PM Jeff Davis wrote:
> Most of the complaints seem to be complaints about v15 as well, and
> while those complaints may be a reason to not make ICU the default,
> they are also an argument that we should continue to learn and try to
> fix those issues because they exis
On Fri, 2023-04-21 at 16:33 -0400, Robert Haas wrote:
> And the fact that "C" or "POSIX" gets transformed into
> "en-US-u-va-posix"
I already expressed, on reflection, that we should probably just not do
that. So I think we're in agreement on this point; patch attached.
Regards,
Jeff Davi
> > My opinion is that the switch to using ICU by default is ill-advised
> > and should be reverted.
>
> Most of the complaints seem to be complaints about v15 as well, and while
> those complaints may be a reason to not make ICU the default, they are also
> an argument that we should continue to
On Fri, 2023-04-21 at 16:33 -0400, Robert Haas wrote:
> My opinion is that the switch to using ICU by default is ill-advised
> and should be reverted.
Most of the complaints seem to be complaints about v15 as well, and
while those complaints may be a reason to not make ICU the default,
they are a
> "Jeff" == Jeff Davis writes:
>> Is that the right fix, though? (It forces --locale-provider=libc for
>> the cluster default, which might not be desirable?)
Jeff> For the "no locale" behavior (memcmp()-based) the provider needs
Jeff> to be libc. Do you see an alternative?
Can lc_collat
On Fri, 2023-04-21 at 22:08 +0100, Andrew Gierth wrote:
> Is that the right fix, though? (It forces --locale-provider=libc for the
> cluster default, which might not be desirable?)
For the "no locale" behavior (memcmp()-based) the provider needs to be
libc. Do you see an alternative?
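
For readers unfamiliar with the "no locale" behavior mentioned here: it amounts to comparing the encoded bytes of the strings. The Python sketch below only models that ordering (it is not PostgreSQL code, and `memcmp_less` is a name made up for the example). It reproduces the two orderings quoted elsewhere in the thread, 'Z' < 'a' and '+' < '-'.

```python
def memcmp_less(a: str, b: str, encoding: str = "utf-8") -> bool:
    """Model of C-locale ordering: compare the encoded bytes.

    Python compares bytes objects lexicographically by unsigned byte
    value, with a shorter prefix sorting first, which matches a
    memcmp() followed by a length tie-break.
    """
    return a.encode(encoding) < b.encode(encoding)


print(memcmp_less("Z", "a"))   # uppercase sorts before lowercase
print(memcmp_less("+", "-"))   # the PostGIS case from the thread
print(sorted(["b", "a", "Z"], key=lambda s: s.encode("utf-8")))
```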
> "Jeff" == Jeff Davis writes:
>> Also, somewhere along the line someone broke initdb --no-locale,
>> which should result in C locale being the default everywhere, but
>> when I just tested it it picked 'en' for an ICU locale, which is not
>> the right thing.
Jeff> Fixed, thank you.
Is
On Fri, 2023-04-21 at 16:00 -0400, Tom Lane wrote:
> Maybe this means we are not ready to do ICU-by-default in v16.
> It certainly feels like there might be more here than we want to
> start designing post-feature-freeze.
I don't see how punting to the next release helps. If the CREATE
DATABASE sy
On Fri, 2023-04-21 at 19:00 +0100, Andrew Gierth wrote:
> Also, somewhere along the line someone broke initdb --no-locale, which
> should result in C locale being the default everywhere, but when I just
> tested it it picked 'en' for an ICU locale, which is not the right
> thing.
Fi
On Fri, Apr 21, 2023 at 3:25 PM Jeff Davis wrote:
> I am also having second thoughts about accepting "C" or "POSIX" as an
> ICU locale and transforming it to "en-US-u-va-posix" in v16. It's not
> terribly useful (why not just use memcmp()?), it's not fast in my
> measurements (en-US is faster), so
On Fri, 2023-04-21 at 13:28 -0400, Tom Lane wrote:
> I am wondering however whether this doesn't mean that all our
> carefully
> coded fast paths for C locale just went down the drain.
The code still exists. You can test it by using the built-in collation
"C" which is correctly specified with coll
On Fri, 2023-04-21 at 21:14 +0200, Sandro Santilli wrote:
> And then runs:
>
> createdb --encoding=UTF-8 --template=template0 --lc-collate=C
>
> Should we tweak anything else to make the results predictable ?
You can specify --locale-provider=libc
Regards,
Jeff Davis
Jeff Davis writes:
> I have a couple ideas:
> 1. Introduce a "none" provider to separate the concept of C/POSIX
> locales from the libc provider. It's not really using a provider
> anyway, it's just using memcmp(), and I think it causes confusion to
> combine them. Saying "LOCALE_PROVIDER=none" i
On Fri, 2023-04-21 at 14:23 -0400, Tom Lane wrote:
> postgres=# CREATE DATABASE test1 TEMPLATE=template0 ENCODING = 'UTF8'
> LOCALE = 'C';
...
> test1 | postgres | UTF8 | icu | C |
> C | en-US | |
> (4 rows)
>
> Looks like the "pick en-US ev
On Fri, Apr 21, 2023 at 10:27:49AM -0700, Jeff Davis wrote:
> On Fri, 2023-04-21 at 19:09 +0200, Sandro Santilli wrote:
> > =# select version();
> > PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu
> > 11.3.0-1ubuntu1~22.04) 11.3.0, 64-bit
> > =# show lc_collate;
> > C
> >
> "Tom" == Tom Lane writes:
>> Also, somewhere along the line someone broke initdb --no-locale,
>> which should result in C locale being the default everywhere, but
>> when I just tested it it picked 'en' for an ICU locale, which is not
>> the right thing.
Tom> Confirmed:
Tom> $ LANG=
On Fri, Apr 21, 2023 at 07:14:13PM +0200, Peter Eisentraut wrote:
> On 21.04.23 19:09, Sandro Santilli wrote:
> > On Fri, Apr 21, 2023 at 11:48:51AM -0400, Tom Lane wrote:
> > > "Regina Obe" writes:
> > >
> > > > https://trac.osgeo.org/postgis/ticket/5375
> > >
> > > If they actually are using l
"Regina Obe" writes:
> CREATE DATABASE test1 TEMPLATE=template0 ENCODING = 'UTF8' LOCALE = 'C';
> Doesn't seem to work at least not under mingw64 anyway.
Hmm, doesn't work for me either:
$ LANG=en_US.utf8 initdb
The files belonging to this database system will be owned by user "postgres".
This u
> Yeah. My recommendation is just LOCALE:
>
> regression=# CREATE DATABASE test1 TEMPLATE=template0 ENCODING = 'UTF8' LOCALE = 'C';
> CREATE DATABASE
> regression=# CREATE DATABASE test2 TEMPLATE=template0 ENCODING = 'UTF8' ICU_LOCALE = 'C';
> NOTICE: using standard form "en-US-u-va-posix" for l
Andrew Gierth writes:
> "Peter" == Peter Eisentraut writes:
> Peter> If the database is created with locale provider ICU, then
> Peter> lc_collate does not apply here,
> Having lc_collate return a value which is silently being ignored seems
> to me rather hugely confusing.
It's not *completel
> "Peter" == Peter Eisentraut writes:
Peter> If the database is created with locale provider ICU, then
Peter> lc_collate does not apply here,
Having lc_collate return a value which is silently being ignored seems
to me rather hugely confusing.
Also, somewhere along the line someone broke
"Regina Obe" writes:
> Okay got it was on IRC with RhodiumToad and he suggested:
> CREATE DATABASE test2 TEMPLATE=template0 ENCODING = 'UTF8' LC_COLLATE = 'C'
> LC_CTYPE = 'C' ICU_LOCALE='C';
> Which gives expected result:
> SELECT '+' < '-' ; -- true
> but gives me a notice:
> NOTICE: usi
> > CREATE DATABASE test TEMPLATE=template0 ENCODING = 'UTF8'
> > LC_COLLATE = 'C'
> > LC_CTYPE = 'C';
>
> As has been pointed out already, setting LC_COLLATE/LC_CTYPE is
> meaningless when the locale provider is ICU. You need to look at what ICU
> locale is being chosen, or force it with LOCALE =