bug#22103: bug#20526: grep BUG: text file is detected as binary

2016-01-08 Thread Jim Meyering
On Fri, Jan 8, 2016 at 5:46 AM, Norihiro Tanaka wrote: > > On Wed, 6 Jan 2016 09:57:46 -0800 > Paul Eggert wrote: > >> On 01/06/2016 12:32 AM, Paul Eggert wrote: >> > I installed the attached patch, which fixed this performance bug for me. >> Whoops! I forgot to 'git add src/search.h' before comm

bug#20526: grep BUG: text file is detected as binary

2016-01-08 Thread Paul Eggert
Paul Eggert wrote: I missed the possibility of a unibyte encoding where not all bytes are valid unibyte characters. I found a significant performance problem related to that bug and bug fix, and installed the attached further patch 0001. Come to think of it, this issue should be in NEWS too,

bug#22103: bug#20526: grep BUG: text file is detected as binary

2016-01-08 Thread Norihiro Tanaka
On Wed, 6 Jan 2016 09:57:46 -0800 Paul Eggert wrote: > On 01/06/2016 12:32 AM, Paul Eggert wrote: > > I installed the attached patch, which fixed this performance bug for me. > Whoops! I forgot to 'git add src/search.h' before committing. We also need > the attached followup patch, which I ins

bug#20526: grep BUG: text file is detected as binary

2016-01-06 Thread Jim Meyering
On Wed, Jan 6, 2016 at 9:57 AM, Paul Eggert wrote: > On 01/06/2016 12:32 AM, Paul Eggert wrote: >> >> I installed the attached patch, which fixed this performance bug for me. > > Whoops! I forgot to 'git add src/search.h' before committing. We also need > the attached followup patch, which I insta

bug#20526: grep BUG: text file is detected as binary

2016-01-06 Thread Paul Eggert
On 01/06/2016 12:32 AM, Paul Eggert wrote: I installed the attached patch, which fixed this performance bug for me. Whoops! I forgot to 'git add src/search.h' before committing. We also need the attached followup patch, which I installed. From 5a71d9d4afc2ec1a7a2c6e5c3fac33709ddc6551 Mon Sep

bug#20526: grep BUG: text file is detected as binary

2016-01-06 Thread Paul Eggert
Paul Eggert wrote: grep -rP 'fed.*cba' . On my machine the above command is 125x slower with the new grep than the old one, which suggests some tuning is in order before releasing. (It's bogged down inside libpcre somewhere.) I installed the attached patch, which fixed this performance bug fo

bug#20526: grep BUG: text file is detected as binary

2016-01-05 Thread Paul Eggert
Norihiro Tanaka wrote: I see that it is a good idea, but I propose a minor change to your fix. Perhaps it will be what you want. I think the problem here is that the code was not computing unibyte_mask correctly; that is, the comment for unibyte_mask is correct, and usage of unibyte_mask is co

bug#20526: grep BUG: text file is detected as binary

2016-01-05 Thread Kamil Dudka
On Wednesday 30 December 2015 19:25:04 Paul Eggert wrote: > I installed into Savannah a patch (attached) that should fix this problem in > typical cases, and am boldly marking the bug as done. Please give the fix a > try if you have the time. Thanks. Thanks for the fixup! I can confirm that it re

bug#20526: grep BUG: text file is detected as binary

2016-01-02 Thread Norihiro Tanaka
On Fri, 1 Jan 2016 21:22:54 -0800 Paul Eggert wrote: > Ouch, good point. I missed the possibility of a unibyte encoding where > not all bytes are valid unibyte characters. I installed the attached > additional patch to fix this, and to test for the bug I recently > introduced here. Thanks, I se

bug#20526: grep BUG: text file is detected as binary

2016-01-01 Thread Paul Eggert
Norihiro Tanaka wrote: Why is this check applied only in multi-byte locales? Ouch, good point. I missed the possibility of a unibyte encoding where not all bytes are valid unibyte characters. I installed the attached additional patch to fix this, and to test for the bug I recently introduced h

bug#20526: grep BUG: text file is detected as binary

2016-01-01 Thread Paul Eggert
Norihiro Tanaka wrote: I get the following output after applying the patch. Is it expected? $ printf 'a\na\377\na\n' | LANG=en_US.utf8 src/grep a a Binary file (standard input) matches Yes, it's expected. Thanks, this should be stated more clearly, so I installed the attached documentation patch.
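
The exchange above turns on one rule: in a UTF-8 locale the byte \377 is an encoding error, so grep prints the matching lines that precede it and then reports the rest of the input as binary. The following is a toy model of that rule only, not grep's actual implementation; the function name `grep_model`, its parameters, and the substring matching are assumptions made for illustration.

```python
def grep_model(data: bytes, pattern: bytes, encoding: str = "utf-8",
               name: str = "(standard input)") -> list[str]:
    """Toy model (hypothetical): print matching lines until the first
    encoding error, then report the remaining input as binary."""
    needle = pattern.decode(encoding)  # assumes the pattern itself is valid
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final newline
    output = []
    for index, raw in enumerate(lines):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            # From here on the input is treated as binary: matching lines
            # are suppressed and replaced by a single message.
            if any(pattern in rest for rest in lines[index:]):
                output.append(f"Binary file {name} matches")
            break
        if needle in text:
            output.append(text)
    return output


# Same input as the printf command quoted in the message above.
print("\n".join(grep_model(b"a\na\xff\na\n", b"a")))
```

Under this model the input from the message yields `a` followed by `Binary file (standard input) matches`, which is the output reported in the thread. Passing `encoding="latin-1"` makes every byte decodable, so no line is treated as binary; that is a simplification and does not cover unibyte encodings in which some bytes are invalid.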

bug#20526: grep BUG: text file is detected as binary

2016-01-01 Thread Norihiro Tanaka
On Thu, 31 Dec 2015 10:04:06 -0800 Paul Eggert wrote: > Yes, it's expected. Thanks, this should be stated more clearly, so I > installed the attached documentation patch. Thanks. By the way, why is this check applied only in multi-byte locales? E.g., if \200 is included in en_US.iso88591 which

bug#20526: grep BUG: text file is detected as binary

2015-12-31 Thread Norihiro Tanaka
On Wed, 30 Dec 2015 19:25:04 -0800 Paul Eggert wrote: > I installed into Savannah a patch (attached) that should fix this > problem in typical cases, and am boldly marking the bug as done. > Please give the fix a try if you have the time. Thanks. I get the following output after applying the patch. Is

bug#20526: grep BUG: text file is detected as binary

2015-12-31 Thread Paul Eggert
Jim Meyering wrote: The combination of this and the grep -oP infloop fix make this look like a good time for a bug-fix release. If there are any other pending bug fixes or small+safe changes people would like to see included, please let us know. I have one major qualm about this: since 'grep' n

bug#20526: grep BUG: text file is detected as binary

2015-12-30 Thread Jim Meyering
On Wed, Dec 30, 2015 at 7:25 PM, Paul Eggert wrote: > I installed into Savannah a patch (attached) that should fix this problem in > typical cases, and am boldly marking the bug as done. Please give the fix a > try if you have the time. Thanks. Thank you! The combination of this and the grep -oP

bug#20526: grep BUG: text file is detected as binary

2015-12-30 Thread Paul Eggert
I installed into Savannah a patch (attached) that should fix this problem in typical cases, and am boldly marking the bug as done. Please give the fix a try if you have the time. Thanks. From ba23b4ee721750399ede8933cf472e0c6aa6e37f Mon Sep 17 00:00:00 2001 From: Paul Eggert Date: Wed, 30 Dec 2