On Thu, Aug 8, 2019 at 12:55 AM Alexander Korotkov
<a.korot...@postgrespro.ru> wrote:
> On Wed, Aug 7, 2019 at 4:11 PM Alexander Korotkov
> <a.korot...@postgrespro.ru> wrote:
> > On Wed, Aug 7, 2019 at 2:25 PM Markus Winand <markus.win...@winand.at> 
> > wrote:
> > > I was playing around with JSON path quite a bit and might have found one 
> > > case where the current implementation doesn’t follow the standard.
> > >
> > > The functionality in question is the comparison operators other than ==.
> > > They use the database default collation rather than the standard-mandated
> > > "Unicode codepoint collation" (SQL-2:2016 9.39 General Rule 12 c iii 2 D,
> > > last sentence in first paragraph).
> >
> > Thank you for pointing this out!  Nikita is about to write a patch fixing that.
>
> Please see the attached patch.
>
> Our idea is to not sacrifice "==" operator performance for standard
> conformance.  So "==" remains a per-byte comparison.  For consistency,
> the other operators compare code points first, then do a per-byte
> comparison.  In some edge cases, when the same Unicode code points have
> different binary representations in the database encoding, this behavior
> diverges from the standard.  In the future we can implement strict
> standard conformance by normalizing input JSON strings.

The previous version of the patch had a buggy implementation of
compareStrings().  A revised version is attached.
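
To make the ordering described in the quoted message concrete, here is a
minimal sketch in C.  It is not the code from the attached patch.  It
assumes both strings have already been decoded into arrays of Unicode
code points, and the names decoded_str, compare_bytes() and
compare_codepoints_then_bytes() are hypothetical, chosen only for this
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical holder: a string in both decoded and raw form. */
typedef struct
{
    const uint32_t *cps;            /* decoded Unicode code points */
    size_t          ncps;           /* number of code points */
    const unsigned char *bytes;     /* raw bytes in the database encoding */
    size_t          nbytes;         /* number of raw bytes */
} decoded_str;

/* Per-byte comparison; on a common prefix the shorter string sorts first. */
static int
compare_bytes(const unsigned char *a, size_t alen,
              const unsigned char *b, size_t blen)
{
    size_t  n = alen < blen ? alen : blen;
    int     cmp = n > 0 ? memcmp(a, b, n) : 0;

    if (cmp != 0)
        return cmp < 0 ? -1 : 1;
    if (alen == blen)
        return 0;
    return alen < blen ? -1 : 1;
}

/*
 * Compare by code point first.  Only when the code point sequences are
 * identical does the raw per-byte comparison decide the result.
 */
static int
compare_codepoints_then_bytes(const decoded_str *a, const decoded_str *b)
{
    size_t  n = a->ncps < b->ncps ? a->ncps : b->ncps;

    for (size_t i = 0; i < n; i++)
    {
        if (a->cps[i] != b->cps[i])
            return a->cps[i] < b->cps[i] ? -1 : 1;
    }
    if (a->ncps != b->ncps)
        return a->ncps < b->ncps ? -1 : 1;

    /* Same code points: fall back to the bytes as a tie-breaker. */
    return compare_bytes(a->bytes, a->nbytes, b->bytes, b->nbytes);
}
```

With this ordering "B" (U+0042) sorts before "a" (U+0061), whereas many
linguistic collations would put "a" first.  The per-byte tie-breaker only
matters in the edge case mentioned above, where the same code points have
different binary representations in the database encoding.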

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment: 0001-Use-Unicode-codepoint-collation-in-jsonpath-3.patch
Description: Binary data
