Hello,

On 13/08/18 03:51 PM, Assaf Gordon wrote:
> I suspect there is an uninitialized memory access deep inside
> regex_internal.c under very particular circumstances.

(continuation of https://lists.gnu.org/r/bug-gnulib/2018-08/msg00071.html )

I've pinpointed the change that causes the segfault,
and it likely also affects glibc.

1. The input regex contains a multibyte character whose
   upper- and lowercase representations differ.
2. The input regex also contains a NUL character.
3. In regex_internal.c function build_wcs_upper_buffer(),
   the code was changed like so:

-       if (BE ((size_t) (mbclen + 2) > 2, 1))
+       if (BE (mbclen < (size_t) -2, 1))

The new condition treats the case "mbclen==0" differently:
the old test is false for 0, while the new test is true.
This eventually leads to incorrect code flow, and then to a crash.

In gnulib, this was changed long ago:
===
https://git.savannah.gnu.org/cgit/gnulib.git/commit/?id=8335a4d6
commit 8335a4d6c7b4448cd0bcb6d0bebf1d456bcfdb17
Date:   Mon Apr 10 06:43:33 2006 +0000

    Merge regex changes from libc, removing some of our
    POSIX-conformance changes that were rejected and redoing them in a
    less-intrusive way.
===

And recently it was ported back to glibc:
===
https://sourceware.org/git/?p=glibc.git;a=commit;h=eb04c213
commit eb04c21373e2a2885f3d52ff192b0499afe3c672
Date:   Wed Dec 20 09:47:44 2017 -0200

    posix: Sync gnulib regex implementation
===


To reproduce (using gnulib's code), try the following:

   git clone git://git.sv.gnu.org/sed.git
   cd sed
   ./bootstrap

The attached patch keeps both the old and the new code,
selected with "#ifdef REGEX_FIX":

   patch -p1 < regex-internal-bug.patch
   ./configure --with-included-regex CFLAGS="-O0 -g"
   make
   printf "/\xe1\xbe\xbe\x5c\x00/I" > 1.sed

This will segfault:

 ./sed/sed -f 1.sed < /dev/null

Rebuilding with the old code does not segfault:

 rm lib/regex.o ; make CFLAGS="-DREGEX_FIX"
 ./sed/sed -f 1.sed < /dev/null

====

Perhaps it is sufficient to just revert these two lines - but I'm
not sure if there will be other side effects.

Comments welcome,
 - assaf
--- gnulib/lib/regex_internal.c	2018-08-24 17:16:59.161610807 -0600
+++ lib/regex_internal.c	2018-08-24 17:08:07.985496439 -0600
@@ -317,7 +317,11 @@
 	  mbclen = __mbrtowc (&wc,
 			      ((const char *) pstr->raw_mbs + pstr->raw_mbs_idx
 			       + byte_idx), remain_len, &pstr->cur_state);
+#ifdef REGEX_FIX
+	  if (BE (mbclen + 2 > 2, 1))
+#else
 	  if (BE (mbclen < (size_t) -2, 1))
+#endif
 	    {
 	      wchar_t wcu = __towupper (wc);
 	      if (wcu != wc)
@@ -386,7 +390,11 @@
 	else
 	  p = (const char *) pstr->raw_mbs + pstr->raw_mbs_idx + src_idx;
 	mbclen = __mbrtowc (&wc, p, remain_len, &pstr->cur_state);
+#ifdef REGEX_FIX
+	if (BE (mbclen + 2 > 2, 1))
+#else
 	if (BE (mbclen < (size_t) -2, 1))
+#endif
 	  {
 	    wchar_t wcu = __towupper (wc);
 	    if (wcu != wc)
@@ -409,6 +417,7 @@
 		    if (pstr->offsets == NULL)
 		      {
 			pstr->offsets = re_malloc (Idx, pstr->bufs_len);
+			memset (pstr->offsets, 0xBC, sizeof(Idx)*pstr->bufs_len);
 
 			if (pstr->offsets == NULL)
 			  return REG_ESPACE;
