Jim Finnerty wrote:
> SET client_encoding = WIN1252;
> [...]
> postgres=# SELECT * FROM locations WHERE location LIKE 'Franche-Comt__'; --
> the wildcard is applied byte by byte instead of character by character, so
> the 2-byte accented character is matched only by 2 '_'s
>location
Jim Finnerty wrote:
> Currently nondeterministic collations are disabled at the database level.
Deterministic ICU collations are also disabled.
> The cited reason was because of the lack of LIKE support and because certain
> catalog views use LIKE.
But the catalogs shouldn't use the d
Jim Finnerty wrote:
> For a UTF8 encoded, case-insensitive (nondeterministic), accent-sensitive
> ICU
> collation, a LIKE predicate can be used with a small transformation of the
> predicate, and the pattern can contain multi-byte characters:
>
> from:
>
> SELECT * FROM locations WHERE l
On 2019-03-18 00:19, Peter Eisentraut wrote:
> On 2019-03-11 21:36, Peter Eisentraut wrote:
>> Patches here. This will allow all the existing collation customization
>> options as well as the ones being proposed in this thread to work in
>> older ICU versions.
>
> This has been committed, and her
On 2019-03-08 11:09, Peter Eisentraut wrote:
> On 2019-03-07 20:04, Daniel Verite wrote:
>> With previous versions, we'd need to call ucol_setAttribute(),
>> with the attributes and values defined here:
>> http://icu-project.org/apiref/icu4c/ucol_8h.html
>> for instance to get colStrength=secondary
On 2019-03-07 20:04, Daniel Verite wrote:
> With previous versions, we'd need to call ucol_setAttribute(),
> with the attributes and values defined here:
> http://icu-project.org/apiref/icu4c/ucol_8h.html
> for instance to get colStrength=secondary:
> ucol_setAttribute(coll, UCOL_STRENGTH , UCOL_
Peter Eisentraut wrote:
> The problem is not the syntax but that the older ICU versions don't
> support the *functionality* of ks-level2 or colStrength=secondary. If
> you try it, you will simply get a normal case-sensitive behavior.
My bad, I see now that the "old locale extension synta
On 2019-03-05 18:48, Daniel Verite wrote:
>> Older ICU versions (<54) don't support all the locale customization
>> options, so many of my new tests in collate.icu.utf8.sql will fail on
>> older systems. What should we do about that? Have another extra test file?
> Maybe stick to the old-style sy
Peter Eisentraut wrote:
> Older ICU versions (<54) don't support all the locale customization
> options, so many of my new tests in collate.icu.utf8.sql will fail on
> older systems. What should we do about that? Have another extra test file?
Maybe stick to the old-style syntax for the
Peter Eisentraut wrote:
[v7-0001-Collations-with-nondeterministic-comparison.patch]
+GenericMatchText(const char *s, int slen, const char *p, int plen, Oid
collation)
{
+ if (collation && !lc_ctype_is_c(collation) && collation !=
DEFAULT_COLLATION_OID)
+ {
+pg_locale_tlocale =
On 2019-02-21 03:17, Peter Geoghegan wrote:
> I wonder if it would be better to break this into distinct commits?
I thought about that. Especially the planner/executor changes could be
done separately, sort of as a way to address the thread
"ExecBuildGroupingEqual versus collations". But I'm not
On Tue, Feb 19, 2019 at 6:47 AM Peter Eisentraut
wrote:
> Another patch to address merge conflicts.
Some remarks on this:
* Your draft commit message says:
> This patch makes changes in three areas:
>
> - CREATE COLLATION DDL changes and system catalog changes to support
> this new flag.
>
>
Peter Eisentraut wrote:
> Another patch.
+ks key), in order for such such collations to act in
a
s/such such/such/
+
+The pattern matching operators of all three kinds do not support
+nondeterministic collations. If required, apply a different collation
to
+the expr
On 16/01/2019 21:50, Peter Eisentraut wrote:
> On 16/01/2019 14:20, Daniel Verite wrote:
>> I've found another issue with aggregates over distinct:
>> the deduplication seems to ignore the collation.
>
> I have a fix for that. I'll send it with the next update.
Another patch. This fixes your is
On 16/01/2019 14:20, Daniel Verite wrote:
> I've found another issue with aggregates over distinct:
> the deduplication seems to ignore the collation.
I have a fix for that. I'll send it with the next update.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 2
On 14/01/2019 15:37, Andreas Karlsson wrote:
>> Nondeterministic collations do address this by allowing canonically
>> equivalent code point sequences to compare as equal. You still need a
>> collation implementation that actually does compare them as equal; ICU
>> does this, glibc does not AFAICT
Peter Eisentraut wrote:
> > On a table with pre-existing contents, the creation of a unique index
> > does not seem to detect the duplicates that are equal per the
> > collation and different binary-wise.
>
> Fixed in the attached updated patch.
Check. I've found another issue with aggre
On 14/01/2019 13:23, Daniel Verite wrote:
> On a table with pre-existing contents, the creation of a unique index
> does not seem to detect the duplicates that are equal per the
> collation and different binary-wise.
Fixed in the attached updated patch.
--
Peter Eisentraut http://ww
Andreas Karlsson wrote:
> > Nondeterministic collations do address this by allowing canonically
> > equivalent code point sequences to compare as equal. You still need a
> > collation implementation that actually does compare them as equal; ICU
> > does this, glibc does not AFAICT.
>
> A
On 1/10/19 8:44 AM, Peter Eisentraut wrote:
On 09/01/2019 19:49, Andreas Karlsson wrote:
Maybe this is orthogonal and best handled elsewhere but have you when
working with string equality given unicode normalization forms[1] any
thought?
Nondeterministic collations do address this by allowing
Peter Eisentraut wrote:
> Here is an updated patch.
On a table with pre-existing contents, the creation of a unique index
does not seem to detect the duplicates that are equal per the
collation and different binary-wise.
postgres=# \d test3ci
Table "public.test3ci"
Colum
On 09/01/2019 22:01, Daniel Verite wrote:
>> I don't see anything wrong here. The collation says that both values
>> are equal, so which one is returned is implementation-dependent.
> Is it, but it's impractical if the product of seemingly the same GROUP BY
> flip-flops between its different valid
On 09/01/2019 19:49, Andreas Karlsson wrote:
> On 12/28/18 9:55 AM, Peter Eisentraut wrote:
>> Here is an updated patch.
>>
>> I have updated the naming to "deterministic", as discussed.
>
> Maybe this is orthogonal and best handled elsewhere but have you when
> working with string equality given
Peter Eisentraut wrote:
> > =# select n from (values ('été' collate "myfr"), ('ete')) x(n)
> > group by 1 order by 1 ;
> > n
> > -
> > ete
> > (1 row)
> >
> > =# select n from (values ('été' collate "myfr"), ('ete')) x(n)
> > group by 1 order by 1 desc;
> > n
> > -
>
On 12/28/18 9:55 AM, Peter Eisentraut wrote:
Here is an updated patch.
I have updated the naming to "deterministic", as discussed.
Maybe this is orthogonal and best handled elsewhere but have you when
working with string equality given unicode normalization forms[1] any
thought? I feel there
On 04/01/2019 17:05, Daniel Verite wrote:
> When using GROUP BY and ORDER BY on a field with a non-deterministic
> collation, this pops out:
>
> CREATE COLLATION myfr (locale='fr-u-ks-level1',
> provider='icu', deterministic=false);
>
> =# select n from (values ('été' collate "myfr"), ('ete'))
Peter Eisentraut wrote:
> Here is an updated patch.
When using GROUP BY and ORDER BY on a field with a non-deterministic
collation, this pops out:
CREATE COLLATION myfr (locale='fr-u-ks-level1',
provider='icu', deterministic=false);
=# select n from (values ('été' collate "myfr"), ('
Here is an updated patch.
I have updated the naming to "deterministic", as discussed.
I have fixed up support for the "name" type, added foreign key support,
psql, pg_dump support, more tests. There are a couple of TODOs in
bpchar support that I need to look into a bit more. But other than that
Tom Lane wrote:
> I don't really find it "natural" for equality to consider obviously
> distinct values to be equal.
According to https://www.merriam-webster.com/dictionary/natural
"natural" has no less than 15 meanings. The first in the list is
"based on an inherent sense of right and
Peter Geoghegan writes:
> The Unicode consortium calls our current behavior within comparisons
> "deterministic comparisons" -- it's something they're not so keen on:
> https://unicode.org/reports/tr10/#Deterministic_Comparison
> I suggest using their terminology for our current behavior.
Hm, it'
On Wed, Dec 19, 2018 at 6:36 AM Tom Lane wrote:
> I don't really find it "natural" for equality to consider obviously
> distinct values to be equal.
The Unicode consortium calls our current behavior within comparisons
"deterministic comparisons" -- it's something they're not so keen on:
https://
"Daniel Verite" writes:
> To me it seems more natural to find a name for the other behavior, the
> one that consists of overwriting the locale-sensitive equality with
> the result of the byte-wise comparison. AFAIU the flag is meant
> to say: "don't do that".
> Some ideas that come to mind:
> as
Peter Eisentraut wrote:
> So this doesn't actually make the collation case-insensitive or
> anything, it just allows a library-provided collation that is, say,
> case-insensitive to actually work that way.
That's great news!
> So maybe "insensitive" isn't the right name for this flag, bu
33 matches
Mail list logo