bug#22838: New 'Binary file' detection considered harmful

Marcello Perathoner Mon, 29 Feb 2016 09:15:25 -0800

On 02/28/2016 11:13 PM, Paul Eggert wrote:

These changes were put in partly due to security issues, not only having
to do with grep's internals (the old 'grep' would dump core sometimes
when given encoding errors), but also for the benefit of invokers
expecting properly encoded text.


To some extent we were stuck between a rock and a hard place here. No
matter what 'grep' does, it will do the wrong thing for some usages. But
overall we thought it better for grep's output to be valid text.


You are driving out demons by Beelzebub.

grep is a core component of every unix system. You cannot change the behaviour or interface of such a fundamental tool without incurring substantial breakage. Keeping the old bug is far wiser than fixing it and introducing a new one.

Copying faulty input to the output is a preferable failure mode to dropping part of the expected output. People do not expect grep to validate their input but they do expect grep to produce a complete result set.


A text file with encoding problems is a text file and not a binary file.

$ find /etc/ssl/certs/ | LANG= grep pem


Wouldn't the following be better?

find /etc/ssl/certs/ -name '*.pem'

I'm not doing that. That was just an example to show how grep now gives incorrect results.

Many more cases can be made: any process that feeds tainted (user-provided) strings to grep can now be made to fail. E.g. a process that greps apache logs for known exploit signatures will now fail if the attacker sends a bogus user-agent string.
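
A minimal sketch of that scenario, for illustration only. The log contents and the signature string "exploit-sig" are made up, and what the first grep prints depends on the grep version and the locale in effect, so no output is claimed for it. The -a option (--binary-files=text) is shown as one way to make grep treat the input as text again:

```shell
# Hypothetical log: the second line carries a byte (\377) that is not
# valid in a UTF-8 locale, standing in for a bogus user-agent string.
log=$(mktemp)
printf 'GET /a exploit-sig UA=ok\nGET /b exploit-sig UA=\377\n' > "$log"

# With the new detection, grep may report "Binary file ... matches"
# here instead of printing the matching lines (version and locale dependent).
grep exploit-sig "$log"

# -a forces grep to treat the input as text; -c counts the matching lines.
matches=$(LC_ALL=C grep -a -c exploit-sig "$log")
echo "$matches"

rm -f "$log"
```

With -a both lines are counted, so the last command prints 2. A script that relies on a complete result set would have to pass -a explicitly, which existing callers do not do.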





Regards

--
Marcello Perathoner
