Thanks for tracking this bug down. I introduced it in 2006 when I noticed that the expression '(size_t) (mbclen + 2) > 2' can have undefined behavior on (admittedly unlikely) platforms where size_t is one bit narrower than int. (Such platforms have existed in the past - I even worked for a company that sold them! - though these days I expect they're rarely used.) I replaced the expression with 'mbclen < (size_t) -2' to avoid undefined behavior, but unfortunately my replacement was incorrect: the two are not equivalent when mbclen == 0, where the old expression is false and the replacement is true.
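
To make the mbclen == 0 difference concrete, here is a small sketch that puts the three tests side by side. The function names (old_test, buggy_test, fixed_test, check_examples) are made up for illustration and do not appear in gnulib. It assumes an ordinary platform where size_t is at least as wide as unsigned int, so that the old expression is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Pre-2006 test: true only for a real character length (1 or more).
   Assumes size_t is not narrower than int, so mbclen + 2 wraps
   around instead of overflowing.  */
bool
old_test (size_t mbclen)
{
  return (size_t) (mbclen + 2) > 2;
}

/* The 2006 replacement: it rejects (size_t) -1 and (size_t) -2,
   but wrongly accepts 0.  */
bool
buggy_test (size_t mbclen)
{
  return mbclen < (size_t) -2;
}

/* The patched test: it rejects 0 explicitly.  */
bool
fixed_test (size_t mbclen)
{
  return 0 < mbclen && mbclen < (size_t) -2;
}

/* Hypothetical self-check over the interesting mbrtowc return values.  */
void
check_examples (void)
{
  size_t samples[] = { 0, 1, 4, (size_t) -2, (size_t) -1 };
  size_t n = sizeof samples / sizeof samples[0];

  for (size_t i = 0; i < n; i++)
    {
      /* The patched test agrees with the old test everywhere.  */
      assert (fixed_test (samples[i]) == old_test (samples[i]));

      /* The 2006 replacement disagrees only at 0.  */
      assert ((buggy_test (samples[i]) != old_test (samples[i]))
              == (samples[i] == 0));
    }
}
```

Calling check_examples () should pass silently on such a platform: the only sample where the 2006 replacement and the old test disagree is 0, which is the value mbrtowc returns for a null wide character.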

Please try the attached gnulib patch, which should fix the problem in a portable way. Modern GCC optimizes the clear code just as well as the confusing code, so we might as well write it clearly.
From 17542682f92da94550e275a58316c9ad96724374 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Sat, 25 Aug 2018 00:35:05 -0700
Subject: [PATCH] regex: fix uninitialized memory access

Problem and draft fix reported by Assaf Gordon here:
https://lists.gnu.org/r/bug-gnulib/2018-08/msg00071.html
https://lists.gnu.org/r/bug-gnulib/2018-08/msg00142.html
I introduced this bug into gnulib in commit
8335a4d6c7b4448cd0bcb6d0bebf1d456bcfdb17 dated 2006-04-10.
* lib/regex_internal.c (build_wcs_upper_buffer):
Fix bug when mbrtowc returns 0.
---
 ChangeLog            | 11 +++++++++++
 lib/regex_internal.c |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index acd3e2a05..da711a89d 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,14 @@
+2018-08-25  Paul Eggert  <egg...@cs.ucla.edu>
+
+	regex: fix uninitialized memory access
+	Problem and draft fix reported by Assaf Gordon here:
+	https://lists.gnu.org/r/bug-gnulib/2018-08/msg00071.html
+	https://lists.gnu.org/r/bug-gnulib/2018-08/msg00142.html
+	I introduced this bug into gnulib in commit
+	8335a4d6c7b4448cd0bcb6d0bebf1d456bcfdb17 dated 2006-04-10.
+	* lib/regex_internal.c (build_wcs_upper_buffer):
+	Fix bug when mbrtowc returns 0.
+
 2018-08-23  Bruno Haible  <br...@clisp.org>
 
 	getcwd: Add cross-compilation guesses.
diff --git a/lib/regex_internal.c b/lib/regex_internal.c
index 7f0083b91..b10588f1c 100644
--- a/lib/regex_internal.c
+++ b/lib/regex_internal.c
@@ -317,7 +317,7 @@ build_wcs_upper_buffer (re_string_t *pstr)
 	  mbclen = __mbrtowc (&wc,
 			      ((const char *) pstr->raw_mbs + pstr->raw_mbs_idx
 			       + byte_idx), remain_len, &pstr->cur_state);
-	  if (BE (mbclen < (size_t) -2, 1))
+	  if (BE (0 < mbclen && mbclen < (size_t) -2, 1))
 	    {
 	      wchar_t wcu = __towupper (wc);
 	      if (wcu != wc)
@@ -386,7 +386,7 @@ build_wcs_upper_buffer (re_string_t *pstr)
 	else
 	  p = (const char *) pstr->raw_mbs + pstr->raw_mbs_idx + src_idx;
 	mbclen = __mbrtowc (&wc, p, remain_len, &pstr->cur_state);
-	if (BE (mbclen < (size_t) -2, 1))
+	if (BE (0 < mbclen && mbclen < (size_t) -2, 1))
 	  {
 	    wchar_t wcu = __towupper (wc);
 	    if (wcu != wc)
-- 
2.17.1
