On Tue, Sep 24, 2013 at 5:24 AM, Aharon Robbins wrote:
> Hi Jim.
>
> I should note that gawk uses its own regex, although it does rely
> on glibc for isspace / iswspace etc...
...
close 15440
thanks
I've pushed my grep patches, but chose to omit 4 multibyte space
characters from the list in the
# optional
printf '' | ./gawk '/.../' # your tests here. :-)
Much thanks!
> From: Jim Meyering
> Date: Mon, 23 Sep 2013 14:04:09 -0700
> Subject: Re: bug#15440: [PATCH] dfa: fix \s and \S to work for multibyte
> To: Aharon Robbins , 15...@debbugs.gnu.o
[using the right bug address, this time]
On Mon, Sep 23, 2013 at 11:26 AM, Aharon Robbins wrote:
> Hi.
>
>> $ printf '\x82\n' > in; ./grep -q '\S' in && echo match
>> match
>>
>> Now, require a back-reference (forcing switch from grep's DFA matcher
>> to use of the regex functions), and y
This one really surprised me.
Learning that multibyte \s and \S had been broken since grep-2.6 did
not make my day. But fixing it helped.
Here's how it started:
To demonstrate the (first) bug, set up to use a UTF-8 locale:
export LC_ALL=en_US.UTF-8
then run this and note that it matches: