On Mon 22 Oct 2018 at 09:09:12 (-0400), Greg Wooledge wrote:
> On Sun, Oct 21, 2018 at 08:48:28AM -0500, David Wright wrote:
> > On Sun 21 Oct 2018 at 05:25:05 (-0500), Richard Owlett wrote:
> > > I wish a list of files with a specific extension in a directory which
> > > contain keywordA but not keywordB. Recursing down the directory tree
> > > was the primary objection to the MATE search tool.
↑↑↑↑↑↑↑↑↑
> >
> > At last, a direct question!
> >
> > $ grep -L keywordB $(grep -l keywordA a-directory/*extension)
> >
> > Mix with quotes according to taste and needs.
>
> That doesn't recurse (it only considers files at depth 1 in a single
> subdirectory),

Specifically required by the OP.

> and it falls apart on filenames with whitespace.

Left as an exercise for the reader.

> If we ignore the recursion part for a moment, I have a FAQ for the
> "match A but not B" part:
>
> https://mywiki.wooledge.org/BashFAQ/079
>
> The specific example for this case (foo but NOT bar) is at the bottom:
>
> awk '/foo/{good=1} /bar/{good=0;exit} END{exit !good}'
>
> So, all we have to do is write the recursion and extension-filtering
> parts and link them together with the awk command. This is fairly
> straightforward with the standard tools.
>
> find . -type f -name '*.myext' -exec \
>   awk '/keywordA/{good=1} /keywordB/{good=0;exit} END{exit !good}' {} \; \
>   -print
>
>
> Testing:
>
> wooledg:~$ mkdir /tmp/x && cd "$_"
> wooledg:/tmp/x$ mkdir -p a/b/c a/b/d
> wooledg:/tmp/x$ echo keywordA > a/b/c/good.myext
> wooledg:/tmp/x$ echo keywordA keywordB > a/b/d/bad.myext
> wooledg:/tmp/x$ find . -type f -name '*.myext' -exec \
> > awk '/keywordA/{good=1} /keywordB/{good=0;exit} END{exit !good}' {} \; \
> > -print
> ./a/b/c/good.myext
>
>
> Now, the obvious unstated part of the question is that he will want
> keywordA and keywordB to be passed as parameters (although knowing him,
> he will require 17 messages to tell us this).
>
> This is where it actually gets "hard", because the obvious thing to do
> would be to change the quotes on the awk command and embed $1 and $2 in
> it directly. That is a TRAP. It's a code injection bug, because the
> parameters given by the user could contain code that is meaningful to awk,
> which would lead to unexpected results.
>
> For that part of the program, I refer you to:
>
> https://mywiki.wooledge.org/BashProgramming/05
>
> I would use the "awk variables" approach for this one:
>
> #!/bin/sh
> if test "$#" != 2; then
>   printf "usage: %s goodpat badpat\n" "$0" >&2
>   exit 1
> fi
>
> find . -type f -name '*.myext' -exec \
>   awk -v goodpat="$1" -v badpat="$2" \
>     '$0 ~ goodpat {good=1} $0 ~ badpat {good=0;exit} END{exit !good}' {} \; \
>   -print
>
>
> And, testing:
>
> wooledg:/tmp/x$ set -- wordA wordB
> wooledg:/tmp/x$ find . -type f -name '*.myext' -exec \
> > awk -v goodpat="$1" -v badpat="$2" \
> > '$0 ~ goodpat {good=1} $0 ~ badpat {good=0;exit} END{exit !good}' {} \; \
> > -print
> ./a/b/c/good.myext
>
>
> And then, the obvious next extension after THAT would be to make the
> filename extension a parameter. The shell part of that one is super
> easy (no code injection problems with find -name), so I won't bother
> showing it.
>
> At that point, the user interface becomes the real issue. Do you
> put the extension argument first, or last? Do you make it an option?
> Do you hardcode a default extension, or does the lack of a specified
> extension mean that you drop the -name filter altogether? Or do you
> give up the command line interface entirely, and go with a Tk dialog?
>
> But he'll never, ever, EVER be able to answer those questions, so we
> won't have to worry about it.

No, but we all learn something from these posts; at least, I do.

Cheers,
David.
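
For anyone attempting the exercise left to the reader above, here is one possible sketch, not a definitive answer. It stays at depth 1 in a single directory (as the OP asked), avoids the command substitution that breaks on whitespace in file names, and passes the patterns with awk -v rather than embedding them in the awk program. The function name list_a_not_b and its argument order (directory, extension, goodpat, badpat) are hypothetical and do not come from the thread. One known caveat: awk -v interprets backslash escapes in the values it is given.

```sh
#!/bin/sh
# Hypothetical helper (name and argument order are assumptions):
#   list_a_not_b DIR EXT GOODPAT BADPAT
# Prints each regular file DIR/*EXT (depth 1 only, no recursion) that
# contains a match for GOODPAT and no match for BADPAT.
list_a_not_b() {
    dir=$1 ext=$2 goodpat=$3 badpat=$4

    # The glob expands to one word per file, so spaces in names are safe.
    # EXT is quoted, so it is matched literally rather than as a pattern.
    for f in "$dir"/*"$ext"; do
        # Skip directories and the unexpanded pattern when nothing matches.
        [ -f "$f" ] || continue

        # Same awk logic as in the quoted message: awk exits 0 only if
        # goodpat was seen and badpat was never seen.
        awk -v goodpat="$goodpat" -v badpat="$badpat" \
            '$0 ~ goodpat {good=1} $0 ~ badpat {good=0; exit} END {exit !good}' \
            "$f" && printf '%s\n' "$f"
    done
}
```

With the test files created in /tmp/x above, running list_a_not_b a/b/c .myext keywordA keywordB from that directory should print a/b/c/good.myext, while pointing it at a/b/d should print nothing.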