A shell script can take file names from find or ls with 'while read', or from a glob with 'for f in pattern', and examine them one by one: run 'grep -q' on the file, or on the uncompressed stream from that file, to find out whether it has a match, and if so 'echo' the file name. If you want the matching lines instead, 'while read l' can read the stream coming out of grep and prefix each line with the file name in an 'echo'. It helps to juggle streams rather than file names, and to create streams rather than temp files, which have to be cleaned up and add delay. In bash, 'while read' sometimes gets tricky: when the loop sits on the receiving end of a pipe it runs in a subshell, so variables set inside it are gone once the loop ends. A parenthesis or brace wrapper around the loop and whatever uses those variables helps. Both ksh and bash also have the nice '<(command)' feature, which turns a command's stdout into an input file name, and '>(command)', which does the same for an output stream. Bash has so many nice tricks that I often google for them, like how to get 'if' to recognize a pattern. If you do not trust extensions, you can use '$(file filename)' to find out what you have in hand:

$ echo $(file .profile)
.profile: ASCII text
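
The loop described above can be sketched roughly as follows. This is only an illustration, not a replacement for zgrep: it assumes file(1), gzip, bzip2 and xz are installed, the function names cat_any, grep_names and grep_lines are made up for this example, and the strings matched against the output of 'file -b' are what common versions of file(1) print, which may differ on other systems.

```sh
# Write the contents of file "$1" to stdout, decompressing when file(1)
# reports a known compressed format. The extension is never consulted.
# ASSUMPTION: 'file -b' output starts with "gzip", "bzip2" or "XZ".
cat_any() {
    case "$(file -b -- "$1")" in
        gzip*)  gzip -dc -- "$1" ;;
        bzip2*) bzip2 -dc -- "$1" ;;
        XZ*)    xz -dc -- "$1" ;;
        *)      cat -- "$1" ;;
    esac
}

# Roughly "grep -rl": print the name of each file under directory "$2"
# whose (decompressed) contents match pattern "$1".
# The braces keep the loop and the final test in the same subshell, so
# the counter n is still visible after the loop ends.
grep_names() {
    pattern=$1
    find "$2" -type f | {
        n=0
        while IFS= read -r f; do
            if cat_any "$f" | grep -q -e "$pattern"; then
                printf '%s\n' "$f"
                n=$((n + 1))
            fi
        done
        [ "$n" -gt 0 ]   # exit status 0 only if at least one file matched
    }
}

# Roughly "grep -rH": prefix every matching line with its file name.
grep_lines() {
    pattern=$1
    find "$2" -type f | while IFS= read -r f; do
        cat_any "$f" | grep -e "$pattern" | while IFS= read -r l; do
            printf '%s:%s\n' "$f" "$l"
        done
    done
}
```

For example, 'grep_names pattern123 /var/log' lists the matching files and 'grep_lines pattern123 /var/log' prints the matching lines. Because the names are read one line at a time, file names containing newlines will break this; in bash, 'find ... -print0' read with 'read -r -d ""' from '<(command)' avoids that and also keeps the loop out of a subshell.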

On Tuesday, April 23, 2024 at 11:21:26 AM EDT, Mary <maryc...@proton.me> wrote:

> Thanks for the suggestion. You're right, this would be better than zgrep
> etc.
>
> I have some qualms though, as the new option would increase the attack
> surface for 'grep', in that you could then execute arbitrary code by
> passing certain options to 'grep'. Is there some safer way to get what
> you want?

There is still the possibility of including the respective compression libraries directly in grep and using `-Z` and `-J` as proposed, but this wouldn't allow the use of less popular compression algorithms.

One possibility would be to give grep a special arg0 to enable shell commands, like `jgrep zcat pattern123 file.gz`. But I'm not sure it's worth the trouble.

> One supposes that if the file extension is not trustworthy, one can taste
> file like the file command, and use libraries like the gzip libraries to
> handle gzipped files as a stream. There are so many others: zip files could
> be treated like directories and all the files in them that match the glob
> could be searched, and then there is bzip2, 7zip, .... It becomes a
> popularity contest! One can do all this with shell scripting, and leave poor
> old grep out of it!

The reason I wanted to do this in grep directly is that it's difficult to implement with shell scripting. I noticed that neither zgrep, bzgrep nor xzgrep supports the `-r` option, among others, presumably because it's too difficult to implement in a portable way.

I made my patch use a shell command specifically to provide maximum flexibility with minimum maintenance cost. But it does open the door to security risks, so I understand if it's not worth adding to grep.