Linda Walsh wrote:
I had one file that it bailed on saying it has an invalid UTF-8 encoding -- but the line was recursive starting from '.' -- and it didn't name the file
That's pretty vague. Can you reproduce that problem? I don't observe it:
  $ mkdir d
  $ printf 'a\200\n' >d/f
  $ printf 'b\200\n' >d/g
  $ grep -r a d
  Binary file d/f matches
"-a" doesn't work, BTW:
  Ishtar:/tmp> grep -a '\000\000' zeros
  Ishtar:/tmp> echo $?
  1
That's the way 'grep' has always behaved. The regular expression '\0' matches the string "0", not the NUL byte.
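For what it's worth, here is a minimal sketch of one way to search for two consecutive NUL bytes. It assumes a GNU grep built with -P (PCRE) support, where \x00 denotes the NUL byte; the scratch directory and the contents of the test file are made up for illustration and are not the file from the report.

```shell
# Hypothetical test file: 'x', two NUL bytes, 'y', newline.
tmp=$(mktemp -d)
printf 'x\000\000y\n' > "$tmp/zeros"

# Without -P, '\000\000' does not denote NUL bytes, so this finds nothing.
# (stderr is discarded because newer greps may warn about stray backslashes.)
if grep -a '\000\000' "$tmp/zeros" >/dev/null 2>&1; then plain=match; else plain=nomatch; fi

# With -P, \x00 is the NUL byte; -a makes grep treat the file as text.
if grep -aP '\x00\x00' "$tmp/zeros" >/dev/null 2>&1; then pcre=match; else pcre=nomatch; fi

echo "plain: $plain, pcre: $pcre"
rm -rf "$tmp"
```

On a grep without PCRE support the second command fails, and the script reports that as "nomatch".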
  Ishtar:/tmp> grep -P '\000\000' zeros
  Binary file zeros matches
I don't follow this example; perhaps some text was omitted? Anyway, -P has treated files containing zeros as binary files ever since -P was introduced. It's the same as without -P.
But there it is -- if grep wasn't meant to handle binary files, it wouldn't know to call 'zeroes' a binary file.
Obviously, grep *is* meant to handle binary files; it's documented to handle them in a particular way.
how can 'shuf' claim to work on input lines yet have this allowed:
  -z, --zero-terminated    line delimiter is NUL, not newline.
I don't follow this point. -z is a nice feature; we don't want to get rid of it.
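To illustrate why -z is useful, here is a small sketch assuming GNU shuf, sort and tr. The three sample records are invented; one of them contains an embedded newline, which with -z stays inside its record instead of splitting it.

```shell
# Three NUL-terminated records; the second contains a newline.
# Shuffle them, sort them back into a stable order, then make the result
# printable: embedded newlines become spaces, NUL terminators become newlines.
out=$(printf 'one\000two\nlines\000three\000' | shuf -z | sort -z | tr '\n\000' ' \n')
printf '%s\n' "$out"
```

Each output line corresponds to one input record, so "two lines" appears as a single item however the shuffle came out.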
People argue to dumb down POSIX utils, because some corp wants to get a posix label but has a few shortcomings -- so they donate enough money and posix changes it's rules.
I'm afraid you've gone off the deep end here.