bug#22945: Surprising behaviour (bug?) of zgrep in combination with the -f option and process substitutions

Paul Eggert Sat, 19 Mar 2016 12:15:12 -0700

Jim Meyering wrote:

Might be tricky to portably transform that NUL byte into something we
can embed in a command-line-specified search string. Is there even a
notation for that? I don't think so.


But NUL problems aside, this also should work, requiring alternation
in the regexp derived from input with two or more lines, but then
we'll have to escape embedded '|' bytes, too:

How about the attached patch instead? It uses a bigger hammer, which should address both issues.
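
For readers who want to see the underlying problem in isolation, here is a minimal sketch. It is not the patch. It shows that a pipe (which is what Bash's <(COMMAND) hands to zgrep) can be consumed only once, and that copying it into a regular file first lets grep reread the patterns for each input. The names workdir, h1, h2, pat, reads and count are made up for the demonstration, and the sketch assumes that 'mktemp -d' is available.

```shell
#!/bin/sh
# Illustrative sketch only; the names below are hypothetical, not from zgrep.in.
workdir=$(mktemp -d) || exit 2   # assumption: mktemp -d exists on this system
printf 'needle\nhay\n' >"$workdir/h1"
printf 'straw\nneedle\n' >"$workdir/h2"

# A pipe can be consumed only once: the second reader sees end-of-file.
reads=$(printf 'needle\n' | { a=$(cat); b=$(cat); printf '%s|%s' "$a" "$b"; })

# Copying the one-shot stream to a regular file makes it rereadable,
# so the same patterns can be applied to every input file.
printf 'needle\n' | cat >"$workdir/pat"
count=0
for f in "$workdir/h1" "$workdir/h2"; do
  if grep -q -f "$workdir/pat" "$f"; then
    count=$((count + 1))
  fi
done

printf 'reads=%s count=%s\n' "$reads" "$count"
rm -rf "$workdir"
```

The first reader gets the pattern and the second gets nothing, while the copy in a regular file matches in both inputs.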

From a5c927ea71ccc86fbb90ab0ea6083bf3cdcd9472 Mon Sep 17 00:00:00 2001
From: Paul Eggert <egg...@cs.ucla.edu>
Date: Thu, 17 Mar 2016 13:08:06 -0700
Subject: [PATCH] zgrep: with -f SPECIAL, read SPECIAL just once

Problem reported by Fulvio Scapin in: http://bugs.gnu.org/22945
* NEWS: Document this.
* tests/zgrep-f: Add a test.
* zgrep.in (with_filename): With -f FILE, if FILE is stdin or not
a regular file, copy it into a temporary and use the temporary.
---
 NEWS          |  3 +++
 tests/zgrep-f |  8 ++++++++
 zgrep.in      | 45 ++++++++++++++++++++++++++++++++++++++-------
 3 files changed, 49 insertions(+), 7 deletions(-)

diff --git a/NEWS b/NEWS
index 6363d71..541ad94 100644
--- a/NEWS
+++ b/NEWS
@@ -35,6 +35,9 @@ GNU gzip NEWS                                    -*- outline -*-
   gzip -k -v no longer reports that files are replaced.
   [bug present since the beginning]
 
+  zgrep -f A B C no longer reads A more than once if A is not a regular file.
+  This better supports invocations like 'zgrep -f <(COMMAND) B C' in Bash.
+  [bug introduced in gzip-1.2]
 
 * Noteworthy changes in release 1.6 (2013-06-09) [stable]
 
diff --git a/tests/zgrep-f b/tests/zgrep-f
index a8eb746..9a86550 100755
--- a/tests/zgrep-f
+++ b/tests/zgrep-f
@@ -29,6 +29,14 @@ zgrep -f - haystack.gz < n > out 2>&1 || fail=1
 
 compare out n || fail=1
 
+if ${BASH_VERSION+:} false; then
+  set +o posix
+  # This failed with gzip 1.6.
+  cat n n >nn || framework_failure_
+  eval 'zgrep -h -f <(cat n) haystack.gz haystack.gz' >out || fail=1
+  compare out nn || fail=1
+fi
+
 # This failed with gzip 1.4.
 echo a-b | zgrep -e - > /dev/null || fail=1
 
diff --git a/zgrep.in b/zgrep.in
index c24be57..06baf38 100644
--- a/zgrep.in
+++ b/zgrep.in
@@ -55,6 +55,7 @@ files_with_matches=0
 files_without_matches=0
 no_filename=0
 with_filename=0
+pattmp=
 
 while test $# -ne 0; do
   option=$1
@@ -113,13 +114,34 @@ while test $# -ne 0; do
     # The pattern is coming from a file rather than the command-line.
     # If the file is actually stdin then we need to do a little
     # magic, since we use stdin to pass the gzip output to grep.
-    # Turn the -f option into an -e option by copying the file's
-    # contents into OPTARG.
-    case $optarg in
-    (" '-'" | " '/dev/stdin'" | " '/dev/fd/0'")
-      option=-e
-      optarg=" '"$(sed "$escape") || exit 2;;
-    esac
+    # Similarly if it is not a regular file, since it might be read repeatedly.
+    # In either of these two cases, copy the pattern into a temporary file,
+    # and use that file instead.  The pattern might contain null bytes,
+    # so we cannot simply switch to -e here.
+    if case $optarg in
+       (" '-'" | " '/dev/stdin'" | " '/dev/fd/0'")
+         :;;
+       (*)
+         eval "test ! -f$optarg";;
+       esac
+    then
+      if test -n "$pattmp"; then
+        eval "cat --$optarg" >>"$pattmp"
+        continue
+      fi
+      trap '
+        test -n "$pattmp" && rm -f "$pattmp"
+        (exit 2); exit 2
+      ' HUP INT PIPE TERM 0
+      if type mktemp >/dev/null 2>&1; then
+        pattmp=$(mktemp -t -- "zgrep.XXXXXX") || exit 2
+      else
+        set -C
+        pattmp=${TMPDIR-/tmp}/zgrep.$$
+      fi
+      eval "cat --$optarg" >"$pattmp"
+      optarg=' "$pattmp"'
+    fi
     have_pat=1;;
   (--h | --he | --hel | --help)
     echo "$usage" || exit 2
@@ -232,5 +254,14 @@ do
   test 126 -le $res && break
 done
 
+if test -n "$pattmp"; then
+  rm -f "$pattmp" || {
+    r=$?
+    test $r -lt 2 && r=2
+    test $res -lt $r && res=$r
+  }
+  trap - HUP INT PIPE TERM 0
+fi
+
 test 128 -le $res && kill -$(expr $res % 128) $$
 exit $res
-- 
2.5.0
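
The cleanup side of the patch relies on a trap to remove the temporary file if the script exits early. The following is a minimal sketch of that general pattern, not the zgrep code itself. The names demo_dir, tmp, status and leftover are hypothetical, and the sketch assumes 'mktemp -d' is available. It runs the work in a subshell so the effect of the exit trap can be observed from outside.

```shell
#!/bin/sh
# Illustrative sketch only; names are hypothetical, not from zgrep.in.
demo_dir=$(mktemp -d) || exit 2   # assumption: mktemp -d exists on this system

(
  tmp=$demo_dir/pattern.tmp
  # Remove the temporary file whenever this subshell exits (signal 0 = EXIT).
  trap 'rm -f "$tmp"' 0
  printf 'needle\n' >"$tmp"
  test -f "$tmp" || exit 2
  exit 1                          # simulate a failure after the file exists
)
status=$?

# Check from the outside whether the trap cleaned up.
if test -e "$demo_dir/pattern.tmp"; then
  leftover=yes
else
  leftover=no
fi
rmdir "$demo_dir"
printf 'status=%s leftover=%s\n' "$status" "$leftover"
```

The subshell's exit status is preserved because the trap action does not itself call exit. The patch goes further by also trapping HUP, INT, PIPE and TERM and by resetting the traps once the file has been removed normally.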
