bug#43577: wrong result for grep -io in turkish locale

2020-09-23 Paul Eggert
On 9/23/20 6:47 PM, Norihiro Tanaka wrote:
> I attach the fix for the bug. Regex is fixed in Paul, thank you.

Thanks, I had written a similar patch, and your patch helped me find a bug in what I wrote. The patch I wrote uses an auxiliary ok_fold table that lets fgrep_icase_charlen avoid calli

bug#43577: wrong result for grep -io in turkish locale

2020-09-23 Norihiro Tanaka
I attach the fix for the bug. Regex is fixed in Paul, thank you.

From 884c46aadbe6a2f7203f84d4173a515ca4ccf8de Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka
Date: Thu, 24 Sep 2020 10:39:46 +0900
Subject: [PATCH] grep: fix ignore-case Turkish bug

* src/grep.c (fgrep_icase_charlen): Do not assume

bug#43577: wrong result for grep -io in turkish locale

2020-09-23 Paul Eggert
On 9/23/20 7:30 AM, Jim Meyering wrote:
> $ LC_ALL=tr_TR.utf8 src/grep -i a
> zsh: abort (core dumped)  LC_ALL=tr_TR.utf8 src/grep -i a

I can reproduce this bug. There seems to be a performance regression too. I'll look into it.

bug#43577: wrong result for grep -io in turkish locale

2020-09-23 Jim Meyering
On Wed, Sep 23, 2020 at 6:24 AM Norihiro Tanaka wrote:
>
> In the Turkish locale, upper and lower case are mapped as follows.
>
> U0049 <-> U0131
> U0069 <-> U0130
>
> It's expected that both of the following test cases return U0130, but the
> latter returns nothing.
>
> $ printf '\304\260\n' >I # U0130
>

bug#43577: wrong result for grep -io in turkish locale

2020-09-23 Norihiro Tanaka
In the Turkish locale, upper and lower case are mapped as follows.

U0049 <-> U0131
U0069 <-> U0130

It's expected that both of the following test cases return U0130, but the latter returns nothing.

$ printf '\304\260\n' >I # U0130
$ env LC_ALL=tr_TR.utf8 grep -i i I
İ # U0130
$ env LC_ALL=tr_TR.utf8 gre
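
The mapping in this report can be checked without a Turkish locale installed. The Python sketch below is only an illustration, not grep's code: the table TURKISH_CASE_PAIRS and the helper utf8_len are hypothetical names introduced here, with the pairs copied by hand from the message. It shows that each ASCII letter in a pair is one byte in UTF-8 while its Turkish counterpart is two bytes, and that U0130 is the byte sequence written by printf '\304\260' in the report.

```python
# Hypothetical table of the case pairs listed in the report (not grep's code).
TURKISH_CASE_PAIRS = {
    "\u0049": "\u0131",  # I <-> dotless i (U0131)
    "\u0069": "\u0130",  # i <-> dotted capital I (U0130)
}


def utf8_len(ch: str) -> int:
    """Return the number of bytes ch occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def pair_lengths():
    """Return (code point, byte length) for both sides of each pair."""
    return [
        ((f"U{ord(a):04X}", utf8_len(a)), (f"U{ord(b):04X}", utf8_len(b)))
        for a, b in TURKISH_CASE_PAIRS.items()
    ]


if __name__ == "__main__":
    for (cp_a, len_a), (cp_b, len_b) in pair_lengths():
        print(f"{cp_a} ({len_a} byte) <-> {cp_b} ({len_b} bytes)")
    # The bytes used by printf '\304\260' in the report are U0130 in UTF-8.
    print("\u0130".encode("utf-8") == b"\304\260")
```

Any code that matches case-insensitively by stepping through bytes has to allow for the two sides of such a pair having different encoded lengths.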