On Sat, 10 Jan 2015 15:43:02 -0800
Jim Meyering <j...@meyering.net> wrote:

> Colleagues Nima Aghdaii and Yuliy Pisetsky found and fixed a heap
> buffer overrun in grep's kwset.c.  The underlying bug, already hard to
> trigger, would most often result in a "mere" heap UMR (uninitialized
> memory read), but Yuliy constructed inputs (see the new test) that
> cause a buffer overrun.
> 
> I'm attaching two other related patches:
>   - grep: avoid false-positive UMR
>   - tests: add support for ASAN memory poisoning
> 
> I expect to push these by Monday.

How about the attachments instead for the second patch?
From 79e8c26cd3249ef39e4a4ab1288259cf40ff46a4 Mon Sep 17 00:00:00 2001
From: Yuliy Pisetsky <ypiset...@fb.com>
Date: Thu, 1 Jan 2015 15:36:55 -0800
Subject: [PATCH 2/2] grep -F: fix a heap buffer (read) overrun

grep's read buffer is often filled to its full size, except when
reading the final buffer of a file.  In that case, the number of
bytes read may be far less than the size of the buffer.  However, for
certain unusual pattern/text combinations, grep -F would mistakenly
examine bytes in that uninitialized region of memory when searching
for a match.  With carefully chosen inputs, one can cause grep -F to
read beyond the end of that buffer altogether.  This problem arose via
commit v2.18-90-g73893ff with the introduction of a more efficient
heuristic using what is now the memchr_kwset function. The use of
that function in bmexec_trans could leave TP much larger than EP,
and the subsequent call to bm_delta2_search would mistakenly access
beyond end of the main input read buffer.

* src/kwset.c (bmexec_trans): When TP reaches or exceeds EP,
do not call bm_delta2_search.
* tests/kwset-abuse: New file.
* tests/Makefile.am (TESTS): Add it.
* THANKS.in: Update.
* NEWS (Bug fixes): Mention it.

Prior to this patch, this command would trigger a UMR:

  printf %0360db 0 | valgrind src/grep -F $(printf %019dXb 0)

  Use of uninitialised value of size 8
     at 0x4142BE: bmexec_trans (kwset.c:657)
     by 0x4143CA: bmexec (kwset.c:678)
     by 0x414973: kwsexec (kwset.c:848)
     by 0x414DC4: Fexecute (kwsearch.c:128)
     by 0x404E2E: grepbuf (grep.c:1238)
     by 0x4054BF: grep (grep.c:1417)
     by 0x405CEB: grepdesc (grep.c:1645)
     by 0x405EC1: grep_command_line_arg (grep.c:1692)
     by 0x4077D4: main (grep.c:2570)

See the accompanying test for how to trigger the heap buffer overrun.

Thanks to Nima Aghdaii for testing and finding numerous
ways to break early iterations of this patch.
---
 NEWS              |  5 +++++
 THANKS.in         |  1 +
 src/kwset.c       |  5 ++++-
 tests/Makefile.am |  1 +
 tests/kwset-abuse | 32 ++++++++++++++++++++++++++++++++
 5 files changed, 43 insertions(+), 1 deletion(-)
 create mode 100755 tests/kwset-abuse

diff --git a/NEWS b/NEWS
index 975440d..3835d8d 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU grep NEWS                                    -*- outline 
-*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  grep no longer reads from uninitialized memory or from beyond the end
+  of the heap-allocated input buffer.
+
 
 * Noteworthy changes in release 2.21 (2014-11-23) [stable]
 
diff --git a/THANKS.in b/THANKS.in
index aeaf516..624478d 100644
--- a/THANKS.in
+++ b/THANKS.in
@@ -62,6 +62,7 @@ Michael Aichlmayr                   mi...@nx.com
 Miles Bader                         mi...@ccs.mt.nec.co.jp
 Mirraz Mirraz                       mirr...@rambler.ru
 Nelson H. F. Beebe                  be...@math.utah.edu
+Nima Aghdaii                        naghd...@fb.com
 Olaf Kirch                          o...@ns.lst.de
 Paul Kimoto                         kim...@spacenet.tn.cornell.edu
 Péter Radics                        mitchn...@gmail.com
diff --git a/src/kwset.c b/src/kwset.c
index 4003c8d..e2a8e16 100644
--- a/src/kwset.c
+++ b/src/kwset.c
@@ -646,7 +646,8 @@ bmexec_trans (kwset_t kwset, char const *text, size_t size)
                   }
               }
           }
-        if (bm_delta2_search (&tp, ep, sp, len, trans, gc1, gc2, d1, kwset))
+        if (bm_delta2_search (&tp, text + size, sp, len, trans, gc1,
+                             gc2, d1, kwset))
           return tp - text;
       big_advance:;
       }
@@ -654,6 +655,8 @@ bmexec_trans (kwset_t kwset, char const *text, size_t size)
   /* Now we have only a few characters left to search.  We
      carefully avoid ever producing an out-of-bounds pointer. */
   ep = text + size;
+  if (ep < tp)
+    return -1;
   d = d1[U(tp[-1])];
   while (d <= ep - tp)
     {
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 2cba2cd..0508cd2 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -75,6 +75,7 @@ TESTS =                                               \
   inconsistent-range                           \
   invalid-multibyte-infloop                    \
   khadafy                                      \
+  kwset-abuse                                  \
   long-line-vs-2GiB-read                       \
   match-lines                                  \
   max-count-overread                           \
diff --git a/tests/kwset-abuse b/tests/kwset-abuse
new file mode 100755
index 0000000..6d8ec0c
--- /dev/null
+++ b/tests/kwset-abuse
@@ -0,0 +1,32 @@
+#! /bin/sh
+# Evoke a segfault in a hard-to-reach code path of kwset.c.
+# This bug affected grep versions 2.19 through 2.21.
+#
+# Copyright (C) 2015 Free Software Foundation, Inc.
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+fail=0
+
+# This test case chooses a haystack of size 260,000, since prodding
+# with gdb showed a reallocation slightly larger than that in fillbuf.
+# To reach the buggy code, the needle must have length < 1/11 that of
+# the haystack, and 10,000 is a nice round number that fits the bill.
+printf '%0260000dXy\n' 0 | grep -F $(printf %010000dy 0)
+
+test $? = 1 || fail=1
+
+Exit $fail
-- 
2.2.0

Reply via email to