Paul Eggert wrote:
> On 10/16/19 5:19 PM, Trent W. Buck wrote:
> > I would expect "grep -Fw -e 4GB -e DDR4 --and" to print the same thing as
> >
> >     grep -Fw 4GB | grep -Fw DDR4 | grep -Fw -e 4GB -e DDR4 -o
>
> You're right, it's not obvious. :-)
>
> It may be better to just pipe greps together, as you do now. That's simple
> and fast enough for this relatively-uncommon case, and it's portable to all
> greps.
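For concreteness, here is a minimal sketch of the pipeline form quoted above. The file contents are made up for illustration (the real computer_parts.txt is not shown in this thread), and the variable names are hypothetical. Only the -F, -w, -o and -e options of grep are used.

```shell
# Hypothetical sample data standing in for computer_parts.txt.
parts=$(mktemp)
cat >"$parts" <<'EOF'
Kingston 4GB DDR4 2400MHz
Corsair 8GB DDR4 3200MHz
Crucial 4GB DDR3 1600MHz
EOF

# AND of two fixed-string, whole-word matches:
# each grep only sees the lines that the previous grep let through.
and_lines=$(grep -Fw 4GB "$parts" | grep -Fw DDR4)
printf '%s\n' "$and_lines"

# Adding the final -o stage prints only the matched words,
# one per line, instead of the whole matching line.
and_words=$(grep -Fw 4GB "$parts" | grep -Fw DDR4 | grep -Fw -o -e 4GB -e DDR4)
printf '%s\n' "$and_words"

rm -f "$parts"
```

Only the first sample line contains both words, so it is the only one that survives both filters.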
I admit that most of the time, I want "grep --and" for a small dataset (<1MB computer_parts.txt), so it's merely a convenience.

Sometimes I grep audit logs (~1TB uncompressed), which takes anywhere from 15 minutes to 3 days, depending on how I tweak my grep calls. In that case, each grep in the pipeline has to pay the cost of de-serializing input from the previous grep, and re-serializing output for the next grep. If the first grep matches (say) 200GB of the 1TB, that can be a lot of overhead (I assume).

I was basically hoping that if it was all in a single grep process, the de/serialization steps could be skipped completely. I think the buzzword for that is "zero-copy"?

I've noticed plain "grep" is about 30% slower than either "grep -F" or "LC_COLLATE=C grep", because (I think) those two avoid the cost of decoding from UTF-8 to Unicode and back. So I was basically expecting a similar saving from --and.

I'm only speaking as an end user - I haven't dug through the grep source, so those expectations might be unrealistic, and implementing it might be painful/impossible. I figured I should at least ask :-)

If your expert opinion is that it's a pain to implement (and maintain!) and there's not enough demand, then I'm OK with that. This is NOT something that's burning me every day.

Regardless, I appreciate you taking the time to discuss it. :-)

PS: Regarding portability, I'm personally not worried, because when I need a GNUism badly enough (e.g. du --threshold), I can usually get permission to install the relevant GNU software, even if it's only into %APPDATA% or $HOME.

PS: I noticed on bugs.gnu.org something about grep being single-threaded, which might mean "grep --and" would end up being SLOWER than the existing pipelines, since the kernel can distribute a pipeline's elements across multiple CPUs/cores.
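A rough way to check the plain grep vs. "grep -F" vs. C-locale comparison on other data is sketched below. It is only a sketch under assumptions: the input is a tiny generated file rather than a real audit log, the variable names are hypothetical, and it uses LC_ALL=C (which overrides every locale category) rather than the LC_COLLATE=C mentioned above. The script itself only confirms that the three invocations agree on the match count; the timing has to be done by hand.

```shell
# Hypothetical sample input; a meaningful timing run needs a far larger file.
log=$(mktemp)
i=0
while [ "$i" -lt 1000 ]; do
    echo "line $i 4GB DDR4"
    i=$((i + 1))
done >"$log"

# The same whole-word search done three ways. All three should report
# the same number of matching lines; only the speed is expected to differ.
n_default=$(grep -c -w 4GB "$log")
n_fixed=$(grep -c -Fw 4GB "$log")
n_clocale=$(LC_ALL=C grep -c -w 4GB "$log")
echo "$n_default $n_fixed $n_clocale"

# To compare speed, run each variant under the shell's timer, e.g.:
#   time grep -c -w 4GB "$log"
#   time grep -c -Fw 4GB "$log"
#   time env LC_ALL=C grep -c -w 4GB "$log"

rm -f "$log"
```

Any speed difference will depend on the grep version, the locale in use and the data, so the 30% figure should not be expected to reproduce exactly.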