On Fri, 1 Jan 2016 21:22:54 -0800 Paul Eggert <egg...@cs.ucla.edu> wrote:
> Ouch, good point. I missed the possibility of a unibyte encoding where
> not all bytes are valid unibyte characters. I installed the attached
> additional patch to fix this, and to test for the bug I recently
> introduced here.

Thanks, I think that is a good idea, but I propose a minor change to your fix. Perhaps it will be what you want.

From d36cf4208363c0f56ff32d38a9fea422342036fe Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka <nori...@kcn.ne.jp>
Date: Sat, 2 Jan 2016 00:20:43 +0900
Subject: [PATCH] grep: minor improvements to previous change

* src/grep.c (skip_easy_bytes): Do nothing if the locale does not
have any skippable character.
* (buf_has_encoding_errors): Do nothing if all bytes are single
byte character in the locale.
---
 src/grep.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/grep.c b/src/grep.c
index a5f1fa2..d5a8183 100644
--- a/src/grep.c
+++ b/src/grep.c
@@ -535,6 +535,8 @@ skip_easy_bytes (char const *buf)
      the buffer end, but that's benign. */
   char const *p;
   uword const *s;
+  if (! unibyte_mask)
+    return buf;
   for (p = buf; (uintptr_t) p % sizeof (uword) != 0; p++)
     if (to_uchar (*p) & unibyte_mask)
       return p;
@@ -551,7 +553,7 @@ skip_easy_bytes (char const *buf)
 static bool
 buf_has_encoding_errors (char *buf, size_t size)
 {
-  if (! unibyte_mask)
+  if (unibyte_mask == (uword) -1)
     return false;
 
   mbstate_t mbs = { 0 };
-- 
2.6.4
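
For readers without src/grep.c at hand, here is a rough, standalone sketch of the kind of masked scan that the context lines of the first hunk suggest. It is not the code in grep. It assumes that the mask holds one per-byte pattern repeated across a word, and that a byte can be skipped when it has no bits in common with that pattern (as the `to_uchar (*p) & unibyte_mask` test implies). The names `word`, `repeat_byte` and `skip_masked_bytes` are hypothetical, and unlike the real function this one takes an explicit end pointer instead of possibly reading a little past the buffer end.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for grep's uword type.  */
typedef uintptr_t word;

/* Return a word with the byte B repeated in every byte position.  */
static word
repeat_byte (unsigned char b)
{
  return (word) -1 / 0xFF * b;
}

/* Return a pointer to the first byte in [BUF, LIM) that has a bit in
   common with the per-byte pattern in MASK, or LIM if there is none.
   MASK is assumed to be one byte pattern repeated across the word.  */
static char const *
skip_masked_bytes (char const *buf, char const *lim, word mask)
{
  unsigned char byte_mask = mask & 0xFF;
  char const *p = buf;

  /* Go byte by byte until P is word-aligned.  */
  for (; p < lim && (uintptr_t) p % sizeof (word) != 0; p++)
    if ((unsigned char) *p & byte_mask)
      return p;

  /* Go word by word while a whole word remains and none of its
     bytes matches the mask.  */
  for (; (size_t) (lim - p) >= sizeof (word); p += sizeof (word))
    {
      word w;
      memcpy (&w, p, sizeof w);
      if (w & mask)
        break;
    }

  /* Finish byte by byte, which also locates the matching byte
     inside the word that stopped the loop above.  */
  for (; p < lim; p++)
    if ((unsigned char) *p & byte_mask)
      return p;
  return lim;
}
```

With `repeat_byte (0x80)` as the mask, the scan stops at the first byte that has its high bit set. With a mask of 0 no byte ever matches and the whole buffer is skipped, and with an all-ones mask every nonzero byte matches. These are the two boundary values that the conditions in the patch test for.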