Since grep 2.21, grep fails to report matches in a UTF-8 file with a few
non-UTF-8 bytes interspersed. This is probably related to one of the
recent encoding or multi-byte patches I see in the change log.
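
A minimal sketch of the kind of input described above, not the reporter's actual files: the sample text and the variable names `sample` and `matches` are made up for illustration. It builds a small file with one stray 0xFF byte (never valid in UTF-8) and searches it with GNU grep's documented `-a` (`--text`) option under `LC_ALL=C`, so that the result does not depend on which UTF-8 locales are installed. The failing case from the report would be the same search under a UTF-8 locale without `-a`; whether that reproduces depends on the grep version.

```shell
# Hypothetical sample: one clean line plus one line containing a stray
# 0xFF byte, which is never valid in UTF-8.
sample=$(mktemp)
printf 'clean line\nstray byte \377 here\n' > "$sample"

# -a (--text) asks grep to treat the input as text even when it contains
# bytes that are invalid in the current locale's encoding.
# -c prints the number of matching lines instead of the lines themselves.
matches=$(LC_ALL=C grep -a -c 'stray' "$sample")
echo "matching lines: $matches"

rm -f "$sample"
```

Running the same `printf` output through plain `grep 'stray'` in a UTF-8 locale is a quick way to check how a given grep version behaves.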
I have a number of large UTF-8 source files with some non-UTF-8 char
Paul Eggert wrote:
the mentioned patches are apparently intended to fix issues in
non-UTF-8 locales.
No, they're also needed for UTF-8 locales, I'm afraid. There are some
security issues, not only having to do with grep's internals, but also
for the behavior of downstream programs that may be e
I wrote:
> > echo "A" | LC_ALL=C grep -i "[a]"
> > works as expected
> >
> > echo "A" | LC_ALL=en_US.UTF-8 grep -i "[a]"
> > does not work
Tim wrote:
> Works fine on Fedora Core 4.
OK, this was with sed 4.1.4 on SUSE Linux 10.0; I could have reported
this earlier, sorry.
These tools emit warnings that irritate almost everyone; moreover,
they are threatened with removal.
There was some discussion about this earlier this year, but with no
consolidated outcome, so I'd like to raise the issue again.
Some weird and wrong arguments were presented; first, m
So what is bug #20768?
On 4 Sep 2024 at 04:50, Dale Maggee wrote:
I'm just curious as to the status of this near-decade-old patch? This
would be a super-useful feature which I'd use on a near-daily basis.
It appears the project made the submitter jump through its FSF
paperwork hoops, which he