Thanks, Paul. I tried to clone and compile your latest changes from the Savannah repo, but building from the master branch probably needs some extra requirements that are beyond my knowledge, so I was not able to validate them. Anyway, thanks for the correction and for implementing the fix!
Regards,
Rodrigo

On Sun, Sep 22, 2024 at 3:39 AM Paul Eggert <egg...@cs.ucla.edu> wrote:
> On 2024-09-20 22:41, Paul Eggert wrote:
> > I have the sneaking suspicion that the script is assuming properties of
> > 'grep' that are not documented and that are not guaranteed.
>
> In looking into the code a bit more, I can see some places where that is
> what is happening.
>
> A couple of things.
>
> First, grep 3.11 uses buffer sizes that depend on earlier files that it
> has scanned, and this affects whether grep decides later files are
> binary. This can lead to the sort of confusion that you mentioned. There
> are performance reasons to think that grep should not grow buffer sizes
> for later files merely because earlier files had very long lines, as
> huge buffers can hurt performance; so I installed onto the development
> repository on Savannah the first attached patch to fix that. As a side
> effect this may fix the symptoms you observed.
>
> Second, 'grep' is not a good tool for determining whether a file is text
> or binary, since the definition of "text" vs "binary" is
> application-specific and grep's definition is suitable for 'grep' and
> it's problematic to use it elsewhere. I installed the second attached
> patch to try to document this better.
>
> Hope this helps.
>
> Boldly closing this bug as fixed; if I'm wrong we can reopen it.