> > unfortunately, the last i checked, gnu grep mallocs
> > for each byte of input when using a utf-8 locale.
> 
> that bug was fixed in gnu grep years ago,
> probably before you found and reported it.
> unfortunately, linux distributions were for
> many years not updating their copies of
> gnu grep to the latest version, so very few
> '/bin/grep's had the bug fix.

if i recall correctly, i found that in 2004 or 2005
and fixed it directly from the gnu.org source.
perhaps you remember something i don't.

in any event, it's still not really fixed.  utf-8
performance still sucks:

; grep --version >[2=1] | sed 1q
GNU grep 2.5.4
; time grep missingstring1 mail.tar
0.00u 0.00s 0.01r        grep missingstring1 mail.tar  # status=1
; LANG=en_US.UTF-8 time grep missingstring1 mail.tar
0.44u 0.00s 0.53r        grep missingstring1 mail.tar  # status=1
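
for anyone who wants to repeat the comparison from a posix shell rather than rc, here is a minimal sketch. the function name grep_in_locale is made up for illustration, and it assumes en_US.UTF-8 (or whichever utf-8 locale you pass) is actually installed. it sets LC_ALL rather than LANG so that an LC_ALL or LC_CTYPE already in the environment can't mask the setting.

```sh
#!/bin/sh
# sketch only: run one grep with the locale forced, report its exit status.

# grep_in_locale locale pattern file  (hypothetical helper, not part of grep)
grep_in_locale() {
	status=0
	# LC_ALL overrides LANG and the other LC_* variables for this one command
	LC_ALL=$1 grep -- "$2" "$3" >/dev/null || status=$?
	echo "locale=$1 status=$status"
}

# called as: script locale pattern file
if [ $# -eq 3 ]; then
	grep_in_locale "$1" "$2" "$3"
fi
```

saved under a hypothetical name such as greploc.sh, each locale can be timed on its own: time sh greploc.sh C missingstring1 mail.tar, then the same line with en_US.UTF-8 in place of C. a status of 1 just means grep found no match, as in the transcript above.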

- erik
