On Fri, May 23, 2014 at 8:31 PM, Norihiro Tanaka <nori...@kcn.ne.jp> wrote: > Thanks to Jim and for the new release. Let's just start with, for next > release I want to add further improvement to it. > > In dfaexec, DFA state is always 0 until have found potential match. So > we can improve matching there by continuing to use the transition table > without replacing it. > > I tested the patch, and got about 3x speed-up. > > $ yes j | head -10000000 >k > $ env LC_ALL=C time -p src/grep '\(a\|b\)' k > Before: real 1.12 user 1.06 sys 0.04 > After : real 0.39 user 0.34 sys 0.04 > > I also tested for non-utf8 multibyte locale. > $ env LC_ALL=ja_JP.eucJP time -p src/grep '\(a\|b\)' k > Before: real 1.41 user 1.35 sys 0.05 > After : real 0.38 user 0.32 sys 0.06 ...
Thank you for the patch, but with a 2.30GHz i7-3615QM using 1600MHz DDR3, I see a 40% decrease in performance in that first case: $ for i in grep src/grep; do env LC_ALL=C time -f %e $i '\(a\|b\)' k; done 0.40 0.65 and no change in the latter case. In each case, I used an input file 10 times larger: $ yes j | head -100000000 >k I've also compared on an i5-4250U w/1600MHz DDR3, where there is still a performance decrease, but only 19% this time (each time is the best of 5 trials): 0.43 0.53