On Mon, Nov 28, 2016 at 5:49 AM, Norihiro Tanaka wrote:
> Jim Meyering wrote:
>
>> I suspect this won't be the last word in this area, because it feels
>> like we should be able to adjust DFA's tables so that people using
>> such locales can retain DFA's efficiency without the bug in the
>> curre
Thanks for that DFA fix, which should be much better than the previous
workaround. I installed it into gnulib and installed the attached patch
into grep.
From 76348c87e73b37d44caec3bb5b24c33c1455ed96 Mon Sep 17 00:00:00 2001
From: Paul Eggert
Date: Mon, 28 Nov 2016 08:39:37 -0800
Subject: [PATCH
Jim Meyering wrote:
> I suspect this won't be the last word in this area, because it feels
> like we should be able to adjust DFA's tables so that people using
> such locales can retain DFA's efficiency without the bug in the
> current implementation.
Hi Jim,
It is a bug in dfa for period expre
On Sun, Nov 20, 2016 at 9:53 PM, Jim Meyering wrote:
> On Sun, Nov 20, 2016 at 2:59 PM, Stephane Chazelas
> wrote:
>> 2016-11-20 21:50:28 +, Stephane Chazelas:
>>> $ locale charmap
>>> GB18030
>>> $ printf '\uC9\n' | grep '.*7' | hd
>>> 81 30 87 37 0a
On Sun, Nov 20, 2016 at 2:59 PM, Stephane Chazelas
wrote:
> 2016-11-20 21:50:28 +, Stephane Chazelas:
>> $ locale charmap
>> GB18030
>> $ printf '\uC9\n' | grep '.*7' | hd
>> 81 30 87 37 0a|.0.7.|
>> 0005
>>
>> U+00C9's encoding does end in t
2016-11-20 21:50:28 +, Stephane Chazelas:
> $ locale charmap
> GB18030
> $ printf '\uC9\n' | grep '.*7' | hd
> 81 30 87 37 0a|.0.7.|
> 0005
>
> U+00C9's encoding does end in the 0x37 byte (7 in ASCII and GB18030).
[...]
> Reproduced with 2.25
$ locale charmap
GB18030
$ printf '\uC9\n' | grep '.*7' | hd
81 30 87 37 0a|.0.7.|
0005
U+00C9's encoding does end in the 0x37 byte (7 in ASCII and GB18030).
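For anyone without a GB18030 locale installed, the byte sequence above can be checked with a short Python sketch. It is only an illustration, not grep's DFA code: it assumes Python's built-in "gb18030" codec and the standard re module, and it contrasts byte-oriented matching, which sees the trailing 0x37 as an ASCII 7, with character-oriented matching, which sees a single character U+00C9 and no 7 at all.

```python
import re

text = "\u00c9\n"               # the line fed to grep in the report
data = text.encode("gb18030")   # assumption: Python's built-in GB18030 codec

# Bytes of the encoded line, for comparison with the hd output above.
print(data.hex(" "))

# Byte-oriented matching treats the final 0x37 byte as an ASCII '7'.
bytewise = re.search(rb".*7", data) is not None

# Character-oriented matching sees only U+00C9 and a newline, so no '7'.
charwise = re.search(r".*7", text) is not None

print(bytewise, charwise)
```

The byte-wise search matches and the character-wise search does not, which is the same mismatch the report shows: a multibyte character whose last byte happens to equal an ASCII character.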
$ printf '\uC9\n' | grep '.*0'
fails.
$ printf '\uC9\n' | grep -o '.*7'
returns wi