On 08.05.23 17:48, Peter Eisentraut wrote:
On 27.04.23 13:44, Daniel Verite wrote:
This collation has an empty pg_collation.collversion column, instead
of being set to the same value as "und-x-icu" to track its version.
The original patch implements this as an INSERT in which it would be easy
On 27.04.23 13:44, Daniel Verite wrote:
This collation has an empty pg_collation.collversion column, instead
of being set to the same value as "und-x-icu" to track its version.
The original patch implements this as an INSERT in which it would be easy to
fix I guess, but in current HEAD it come
Peter Eisentraut wrote:
> COLLATE UNICODE
>
> instead of
>
> COLLATE "und-x-icu"
>
> or whatever it is, is pretty useful.
>
> So, attached is a small patch to add this.
This collation has an empty pg_collation.collversion column, instead
of being set to the same value as "un
On Tue, 2023-03-28 at 08:46 -0400, Joe Conway wrote:
> > As long as we still have to initialize the libc locale fields to some
> > language, I think it would be less confusing to keep the ICU locale on
> > the same language.
>
> I definitely agree with that.
Sounds good -- no changes then
On 3/28/23 06:07, Peter Eisentraut wrote:
On 23.03.23 21:16, Jeff Davis wrote:
Another thought: for ICU, do we want the default collation to be
UNICODE (root collation)? What we have now gets the default from the
environment, which is consistent with the libc provider.
But now that we have the
On 23.03.23 21:16, Jeff Davis wrote:
Another thought: for ICU, do we want the default collation to be
UNICODE (root collation)? What we have now gets the default from the
environment, which is consistent with the libc provider.
But now that we have the UNICODE collation, it makes me wonder if we
On Thu, 2023-03-23 at 13:16 -0700, Jeff Davis wrote:
> Another thought: for ICU, do we want the default collation to be
> UNICODE (root collation)? What we have now gets the default from the
> environment, which is consistent with the libc provider.
>
> But now that we have the UNICODE collation,
On Thu, 2023-03-09 at 11:23 -0800, Jeff Davis wrote:
> Looks good to me.
Another thought: for ICU, do we want the default collation to be
UNICODE (root collation)? What we have now gets the default from the
environment, which is consistent with the libc provider.
But now that we have the UNICODE
On 09.03.23 20:23, Jeff Davis wrote:
On Thu, 2023-03-09 at 11:21 +0100, Peter Eisentraut wrote:
How about this patch version?
Looks good to me.
Committed, after adding a test.
On Thu, 2023-03-09 at 11:21 +0100, Peter Eisentraut wrote:
> How about this patch version?
Looks good to me.
Regards,
Jeff Davis
From: Peter Eisentraut
Date: Thu, 9 Mar 2023 11:14:28 +0100
Subject: [PATCH v2] Add standard collation UNICODE
Discussion:
https://www.postgresql.org/message-id/flat/1293e382-2093-a2bf-a397-c04e8f83d...@enterprisedb.com
---
doc/src/sgml/charset.sgml | 31 ---
src/bin/initdb
On Wed, 2023-03-08 at 07:21 +0100, Peter Eisentraut wrote:
> On 04.03.23 19:29, Jeff Davis wrote:
> > It looks like the way you've handled this is by inserting the collation
> > with collprovider=icu even if built without ICU support. I think that's
> > a new case, so we need to make sure i
On 04.03.23 19:29, Jeff Davis wrote:
I do like your approach though because, if someone is using a standard
collation, I think "not built with ICU" (feature not supported) is a
better error than "collation doesn't exist". It also effectively
reserves the name "unicode".
By the way, speaking of
On 04.03.23 19:29, Jeff Davis wrote:
It looks like the way you've handled this is by inserting the collation
with collprovider=icu even if built without ICU support. I think that's
a new case, so we need to make sure it throws reasonable user-facing
errors.
It would look like this:
=> select *
Jeff Davis writes:
> On Sun, 2023-03-05 at 08:27 +1300, Thomas Munro wrote:
>> It's created for UTF-8 only, and UTF-8 sorts the same way as the
>> encoded code points, when interpreted as a sequence of unsigned char
>> by memcmp(), strcmp() etc. Seems right?
> Right, makes sense.
> Though in pr
On Sun, 2023-03-05 at 08:27 +1300, Thomas Munro wrote:
> It's created for UTF-8 only, and UTF-8 sorts the same way as the
> encoded code points, when interpreted as a sequence of unsigned char
> by memcmp(), strcmp() etc. Seems right?
Right, makes sense.
Though in principle, shouldn't someone us
On Sun, Mar 5, 2023 at 7:30 AM Jeff Davis wrote:
> Sorting by codepoint should be encoding-independent (i.e. decode to
> codepoint first); but the C collation is just strcmp, which is
> encoding-dependent. So is UCS_BASIC wrong today?
It's created for UTF-8 only, and UTF-8 sorts the same way as t
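The point made in this exchange, that byte-wise comparison of UTF-8 gives the same order as comparison by code point while other encodings need not, can be checked with a small sketch. This is only an illustration in Python, not PostgreSQL code: the function names and sample strings are hypothetical, and sorting encoded bytes stands in for what memcmp() would do on the encoded data.

```python
def codepoint_order(strings):
    # Sort by the sequence of Unicode code points (encoding-independent).
    return sorted(strings, key=lambda s: [ord(ch) for ch in s])


def byte_order(strings, encoding):
    # Sort by the encoded bytes, compared as unsigned values,
    # which is how memcmp() would order the encoded strings.
    return sorted(strings, key=lambda s: s.encode(encoding))


def matches_codepoint_order(strings, encoding):
    # True if byte-wise order in this encoding equals code point order.
    return byte_order(strings, encoding) == codepoint_order(strings)


# Hypothetical sample: ASCII, Latin-1 range, a BMP character above the
# surrogate range (U+FFE5), and a supplementary character (U+1F600).
samples = ["a", "Z", "\u00e9", "\uffe5", "\U0001F600", "ab", ""]

utf8_ok = matches_codepoint_order(samples, "utf-8")
utf16_ok = matches_codepoint_order(samples, "utf-16-be")
```

With this sample, the UTF-8 order agrees with code point order, while the UTF-16 order does not: U+1F600 is encoded with a surrogate pair starting at byte 0xD8, which compares lower than the 0xFF lead byte of U+FFE5. That is the sense in which a strcmp-based collation is encoding-dependent.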
On Wed, 2023-03-01 at 11:09 +0100, Peter Eisentraut wrote:
> When collation support was added to PostgreSQL, we added UCS_BASIC,
> since that could easily be mapped to the C locale.
Sorting by codepoint should be encoding-independent (i.e. decode to
codepoint first); but the C collation is just
On 3/1/23 11:09, Peter Eisentraut wrote:
The SQL standard defines several standard collations. Most of them are
only of legacy interest (IMO), but two are currently relevant: UNICODE
and UCS_BASIC. UNICODE sorts by the default Unicode collation algorithm
specifications and UCS_BASIC sorts by
9 +0100
Subject: [PATCH] Add standard collation UNICODE
---
doc/src/sgml/charset.sgml | 30 +++---
src/bin/initdb/initdb.c | 10 +++---
2 files changed, 34 insertions(+), 6 deletions(-)
diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 3032392b80..13