I found that re_string_peek_byte_case skips its fast path (the common
cases handled by re_string_peek_byte) whenever pstr->mbs_allocated is
true, so those cases are also skipped for case-insensitive matching.

My solution is to have re_string_peek_byte_case return
re_string_peek_byte also when pstr->icase is true. mbs_allocated is itself
set depending on icase, but I don't think we want to change that globally,
so I think my patch should be fine and should not affect the optimization.
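
To make the intent of the condition change easier to review, here is a
minimal sketch of the before/after behaviour. It is not the gnulib code:
struct toy_string and the toy_* functions are hypothetical, heavily
simplified stand-ins, and the slow path is reduced to "read the
untranslated byte", which is only my reading of what the real function
does once the multibyte and offset handling is stripped away.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for re_string_t (assumption:
   only the fields relevant to this discussion are modelled).  */
struct toy_string
{
  const unsigned char *raw_mbs; /* original input bytes */
  const unsigned char *mbs;     /* working buffer, possibly case-folded */
  size_t cur_idx;               /* current position in the buffers */
  bool mbs_allocated;           /* true if mbs is a separate buffer */
  bool icase;                   /* true for case-insensitive matching */
};

/* Models re_string_peek_byte: read from the working buffer.  */
static unsigned char
toy_peek_byte (const struct toy_string *s, size_t idx)
{
  return s->mbs[s->cur_idx + idx];
}

/* Condition before the patch: fast path only if !mbs_allocated.  */
static unsigned char
toy_peek_byte_case_old (const struct toy_string *s, size_t idx)
{
  if (!s->mbs_allocated)
    return toy_peek_byte (s, idx);
  /* Simplified slow path: fall back to the untranslated byte.  */
  return s->raw_mbs[s->cur_idx + idx];
}

/* Condition after the patch: fast path also when icase is set.  */
static unsigned char
toy_peek_byte_case_new (const struct toy_string *s, size_t idx)
{
  if (!s->mbs_allocated || s->icase)
    return toy_peek_byte (s, idx);
  return s->raw_mbs[s->cur_idx + idx];
}
```

In this toy model, with icase and mbs_allocated both set, the old
condition returns the byte from the original input while the new one
returns the byte from the case-folded working buffer.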

Please check the attached patch. If it is fine, should I also send it to
the gnulib mailing list to have it submitted, or is this one enough?

Best regards,
Tomasz Dziendzielski
From 73bc1e1baadfb9bc51b5e5bd56d474e4fa722491 Mon Sep 17 00:00:00 2001
From: Tomasz Dziendzielski <tomasz.dziendziel...@gmail.com>
Date: Tue, 19 Oct 2021 20:26:28 +0200
Subject: [PATCH] regex: Handle common (easiest) cases if case insensitive

Grep can miss some easy matches if re_string_peek_byte is skipped for
case-insensitive matching.

Fixes bug 39678.

* lib/regex_internal.c (re_string_peek_byte_case): Run
re_string_peek_byte if pstr->icase.

Signed-off-by: Tomasz Dziendzielski <tomasz.dziendziel...@gmail.com>
---
 lib/regex_internal.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/regex_internal.c b/lib/regex_internal.c
index aefcfa2f52e68c6a648d8c1434027ef25aec082f..ae44a75e7e0255455757a2330021c06f5d5ada84 100644
--- a/lib/regex_internal.c
+++ b/lib/regex_internal.c
@@ -843,7 +843,7 @@ re_string_peek_byte_case (const re_string_t *pstr, Idx idx)
   Idx off;
 
   /* Handle the common (easiest) cases first.  */
-  if (__glibc_likely (!pstr->mbs_allocated))
+  if (__glibc_likely (!pstr->mbs_allocated || pstr->icase))
     return re_string_peek_byte (pstr, idx);
 
 #ifdef RE_ENABLE_I18N
-- 
2.33.0
