On Fri, 24 Jul 2009 09:41:41 -0400
erik quanstrom <quans...@quanstro.net> wrote:

> > I've just installed the plan9port as described here (
> > http://swtch.com/plan9port/man/man1/install.html) on a debian box.
> > I was comparing the speed of some commands between the plan9 and the GNU
> > version, and I get consistently poorer results for the plan9 ones.
> > 'grep' for example, is at least twice as slow as its GNU counterpart.
> 
> on my 64-bit system grepping through linux
> source, i do see the same performance difference
> you see.
> 
> ; pwd ; which grep
> /usr/src/linux-2.6.29-gentoo-r5
> /home/quanstro/plan9/bin/grep
> ; for(f in grep /bin/grep)find .|grep '\.[ch]'|
>       xargs time $f -i 'plan[         ]*9'>/dev/null|[2]
>       awk '{a+=$1; b+=$2; c+=$3} END {print a "\t" b "\t"c}' 
> 1.08  0.24    1.36
> 0.46  0.31    0.79
> 
>  but this is not a fair comparison.  gnu
> grep should be using ascii since none of
> the local env variables have been set while
> p9p grep is using utf-8.  let's level the playing
> field:
> 
> ; ; LANG=en_US.UTF-8 for(f in grep /bin/grep)find .|
>       grep '\.[ch]'|xargs time $f -i 'plan[   ]*9'>/dev/null|[2]
>       awk '{a+=$1; b+=$2; c+=$3} END {print a "\t" b "\t"c}' 
> 1.07  0.25    1.37
> 17.13 0.28    17.43
> 
> this is actually a great improvement.  gnu grep used to
> be 80x slower for utf-8 locales, now it's only 40x slower.
> 
> - erik
> 

Try LC_ALL=en_GB.UTF-8 for some weird, weird fun with gnu grep:

 $ wc -l deep-file-list 
470485 deep-file-list

 $ 9 grep ethan deep-file-list |wc -l
428065

 $ time grep ethan deep-file-list > /dev/null 

real    4m29.491s
user    4m29.366s
sys     0m0.080s

 $ time grep -F ethan deep-file-list > /dev/null 

real    4m27.740s
user    4m27.576s
sys     0m0.070s

 $ time awk '/ethan/ {print}' deep-file-list > /dev/null

real    0m2.597s
user    0m2.570s
sys     0m0.017s

 $ time sed -n /ethan/p deep-file-list > /dev/null 

real    0m0.294s
user    0m0.273s
sys     0m0.020s

 $ time 9 grep ethan deep-file-list > /dev/null 

real    0m0.155s
user    0m0.140s
sys     0m0.017s


Note fixed pattern and discarded output. Those are fairly average
timings. They rank gnu grep at 1700 times slower than unstripped p9p
grep. :) Note that awk and sed there are gnu awk and sed, and both are
operating under the same LC_ALL=en_GB.UTF-8 environment. Gnu sed comes
in at 900 times faster than gnu grep, and awk at 100 times.
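
For anyone who wants to recheck those ratios, here is a minimal sketch in
Python that recomputes them from the "real" times pasted above. The helper
names (to_seconds, ratio) are made up for illustration, and it assumes the
bash `time` format shown in the transcript.

```python
import re

def to_seconds(t):
    """Convert a bash `time` value such as '4m29.491s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unrecognised time value: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))

# "real" times copied from the transcript above.
real = {
    "gnu grep": "4m29.491s",
    "gnu grep -F": "4m27.740s",
    "gnu awk": "0m2.597s",
    "gnu sed": "0m0.294s",
    "p9p grep": "0m0.155s",
}
secs = {name: to_seconds(t) for name, t in real.items()}

def ratio(slow, fast):
    """How many times longer `slow` took than `fast`."""
    return secs[slow] / secs[fast]

for name in ("p9p grep", "gnu sed", "gnu awk"):
    print(f"gnu grep / {name}: {ratio('gnu grep', name):.0f}x")
```

The printed values should land close to the rounded 1700, 900 and 100
quoted above.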

I took some more timings after some correspondence with
bug-g...@gnu.org. I do recall gnu grep was twice as fast as p9p grep
when given a plain ascii environment, but I haven't kept other results.
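
Since I didn't keep the ascii numbers, here is a rough sketch of how the
comparison could be repeated under both locales. It is only an
illustration: time_command and compare are hypothetical helper names, it
measures wall-clock time only (no user/sys split), and it assumes the
en_GB.UTF-8 locale is installed on the box.

```python
import os
import subprocess
import time

def time_command(argv, lc_all):
    """Run argv with LC_ALL set to lc_all; return elapsed wall-clock seconds."""
    env = dict(os.environ, LC_ALL=lc_all)
    start = time.perf_counter()
    # Discard output, as in the transcript above, so only matching is timed.
    subprocess.run(argv, env=env, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL, check=False)
    return time.perf_counter() - start

def compare(argv, locales=("C", "en_GB.UTF-8")):
    """Time the same command once per locale; return {locale: seconds}."""
    return {loc: time_command(argv, loc) for loc in locales}

# Hypothetical usage, with the file name taken from the transcript:
#   for loc, t in compare(["grep", "ethan", "deep-file-list"]).items():
#       print(f"LC_ALL={loc}\t{t:.3f}s")
```

A single run per locale is noisy, so repeating each measurement a few
times and averaging would give figures closer to the "fairly average"
ones above.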

I don't know if the gnu grep maintainers are looking for a fix, or even
if they consider this extreme slowness a problem at all. It didn't sound
like it when I corresponded with them, but I guess that could simply
mean they didn't want to discuss it.


-- 
Ethan Grammatikidis

Those who are slower at parsing information must
necessarily be faster at problem-solving.
