On Sun, Oct 21, 2018 at 08:48:28AM -0500, David Wright wrote:
> On Sun 21 Oct 2018 at 05:25:05 (-0500), Richard Owlett wrote:
> > I wish a list of files with a specific extension in a directory which
> > contain keywordA but not keywordB. Recursing down the directory tree
> > was the primary objection to the MATE search tool.
>
> At last, a direct question!
>
> $ grep -L keywordB $(grep -l keywordA a-directory/*extension)
>
> Mix with quotes according to taste and needs.

That doesn't recurse (it only considers files at depth 1 in a single
subdirectory), and it falls apart on filenames with whitespace.

If we ignore the recursion part for a moment, I have a FAQ for the
"match A but not B" part:

https://mywiki.wooledge.org/BashFAQ/079

The specific example for this case (foo but NOT bar) is at the bottom:

awk '/foo/{good=1} /bar/{good=0;exit} END{exit !good}'

So, all we have to do is write the recursion and extension-filtering
parts and link them together with the awk command. This is fairly
straightforward with the standard tools.

find . -type f -name '*.myext' -exec \
  awk '/keywordA/{good=1} /keywordB/{good=0;exit} END{exit !good}' {} \; \
  -print

Testing:

wooledg:~$ mkdir /tmp/x && cd "$_"
wooledg:/tmp/x$ mkdir -p a/b/c a/b/d
wooledg:/tmp/x$ echo keywordA > a/b/c/good.myext
wooledg:/tmp/x$ echo keywordA keywordB > a/b/d/bad.myext
wooledg:/tmp/x$ find . -type f -name '*.myext' -exec \
> awk '/keywordA/{good=1} /keywordB/{good=0;exit} END{exit !good}' {} \; \
> -print
./a/b/c/good.myext

Now, the obvious unstated part of the question is that he will want
keywordA and keywordB to be passed as parameters (although knowing him,
he will require 17 messages to tell us this).

This is where it actually gets "hard", because the obvious thing to do
would be to change the quotes on the awk command and embed $1 and $2 in
it directly. That is a TRAP. It's a code injection bug, because the
parameters given by the user could contain code that is meaningful to
awk, which would lead to unexpected results.

For that part of the program, I refer you to:

https://mywiki.wooledge.org/BashProgramming/05

I would use the "awk variables" approach for this one:

#!/bin/sh
if test "$#" != 2; then
  printf "usage: %s goodpat badpat\n" "$0" >&2
  exit 1
fi
find . -type f -name '*.myext' -exec \
  awk -v goodpat="$1" -v badpat="$2" \
  '$0 ~ goodpat {good=1} $0 ~ badpat {good=0;exit} END{exit !good}' {} \; \
  -print

And, testing:

wooledg:/tmp/x$ set -- wordA wordB
wooledg:/tmp/x$ find . -type f -name '*.myext' -exec \
> awk -v goodpat="$1" -v badpat="$2" \
> '$0 ~ goodpat {good=1} $0 ~ badpat {good=0;exit} END{exit !good}' {} \; \
> -print
./a/b/c/good.myext

And then, the obvious next extension after THAT would be to make the
filename extension a parameter. The shell part of that one is super
easy (no code injection problems with find -name), so I won't bother
showing it.

At that point, the user interface becomes the real issue. Do you put
the extension argument first, or last? Do you make it an option? Do
you hardcode a default extension, or does the lack of a specified
extension mean that you drop the -name filter altogether? Or do you
give up the command line interface entirely, and go with a Tk dialog?

But he'll never, ever, EVER be able to answer those questions, so we
won't have to worry about it.
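
For anyone who does want the extension as a parameter, here is a minimal sketch of one way it might look. The interface is purely an assumption: the function name find_a_not_b is made up for illustration, the extension is an optional third argument, and leaving it out drops the -name filter altogether. The extension is still treated as a find glob, so wildcard characters in it remain wildcards.

```shell
#!/bin/sh
# Sketch only: list files under the current directory that contain a line
# matching goodpat and no line matching badpat.
# Hypothetical interface: find_a_not_b goodpat badpat [ext]
find_a_not_b() {
    if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
        printf 'usage: find_a_not_b goodpat badpat [ext]\n' >&2
        return 1
    fi
    # Save the patterns before reusing the positional parameters.
    goodpat=$1
    badpat=$2
    if [ "$#" -eq 3 ]; then
        # Extension given: build the -name filter for find.
        set -- -name "*.$3"
    else
        # No extension given: no -name filter at all.
        set --
    fi
    # Patterns go in as awk variables, never pasted into the awk program.
    find . -type f "$@" -exec \
        awk -v goodpat="$goodpat" -v badpat="$badpat" \
        '$0 ~ goodpat {good=1} $0 ~ badpat {good=0;exit} END{exit !good}' {} \; \
        -print
}
```

With the test tree from above, "find_a_not_b wordA wordB myext" should print ./a/b/c/good.myext, and leaving off the third argument checks every regular file under the current directory.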