Paul Eggert wrote, in response to my suggestion to filter grep output, not input, for "binary junk":

>> We've done that already, if memory serves.

I don't think so :).

The installed grep on the system I'm typing on right now is "grep (GNU grep) 3.0". I've not checked closely, but I believe that should be a fairly recent grep.

I created a large file ("/tmp/pjbb") by concatenating:

 1) a big plain ASCII file of C source code,
 2) a small ELF executable, and
 3) another big plain ASCII file of C source code.

Then I grep'd in this big file for the string "p...@usa.net", which appeared twice in the first file of C source code, and once again in the second file of C source code.

Here's what I see:

============================
$ grep --version | head -1
grep (GNU grep) 3.0
$ grep p...@usa.net /tmp/pjbb
 * p...@usa.net
 * p...@usa.net
Binary file /tmp/pjbb matches
$ grep -a p...@usa.net /tmp/pjbb
 * p...@usa.net
 * p...@usa.net
 * p...@usa.net
============================

By default, grep sees the first two "p...@usa.net" strings, then abandons the search when it first encounters the ELF binary, before seeing the third. Using "grep -a" to ask grep to persist, it sees all three "p...@usa.net" strings.

===

My ancient home-brew hack, which provides ASCII-trimmed output when scanning binary files for ASCII strings, contains custom code to buffer the already scanned input, so that it can scan backwards once it finds a match. The usual line-oriented buffering doesn't work so well when the input file might have no, or at least infrequent, line breaks.

--
Paul Jackson
p...@usa.net
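
A minimal sketch of the backward-scanning idea described above. It is not the home-brew tool itself, whose source is not shown here. It makes several assumptions: the whole file fits in memory (the original buffers already scanned input instead), "ASCII" means printable characters plus space and tab, and context is capped at a hypothetical `limit` bytes on each side so that input without line breaks cannot produce unbounded output. The name `ascii_context` and the address `pj@example.net` are made up for illustration.

```python
import string

# Bytes counted as printable ASCII context. This definition is an assumption:
# string.printable minus the line-breaking whitespace characters.
PRINTABLE = frozenset(string.printable.encode("ascii")) - frozenset(b"\r\n\x0b\x0c")


def ascii_context(data: bytes, needle: bytes, limit: int = 80) -> list:
    """Return, for each match of needle in data, the surrounding run of
    printable ASCII, trimmed to at most `limit` bytes on either side."""
    results = []
    start = 0
    while True:
        pos = data.find(needle, start)
        if pos < 0:
            break
        # Scan backwards from the match until binary junk or the limit.
        lo = pos
        while lo > 0 and pos - lo < limit and data[lo - 1] in PRINTABLE:
            lo -= 1
        # Scan forwards from the end of the match in the same way.
        end = pos + len(needle)
        hi = end
        while hi < len(data) and hi - end < limit and data[hi] in PRINTABLE:
            hi += 1
        results.append(data[lo:hi].decode("ascii", "replace"))
        start = end
    return results


if __name__ == "__main__":
    sample = (b"\x7fELF\x00\x01int x; /* pj@example.net */\x00\x02"
              b"\xffmore pj@example.net here\nnext")
    for hit in ascii_context(sample, b"pj@example.net"):
        print(hit)
```

Because the scan stops at the first non-printable byte in either direction, a match embedded in binary data is reported with only its readable neighbourhood, rather than suppressed ("Binary file ... matches") or printed along with raw binary bytes.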