bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated

2014-06-06 Thread Norihiro Tanaka
Paul Eggert wrote: > So it looks like your patch confers some advantage, but on my platform > almost all the speedup is achieved simply by switching to the system strstr. First, I tested on CentOS 5.10. Next, I tested on RHEL 6.5 and got the same result as you. strstr() on CentOS 5.10 may be too

bug#17715: [PATCH] dfa: build struct dfamust on demand

2014-06-06 Thread Norihiro Tanaka
If we don't use KWset, struct dfamust doesn't need to be built. This patch changes the code so that it is built on demand.
From 4432de50f8cff0485005794e2d12348de7cf7e11 Mon Sep 17 00:00:00 2001
From: Norihiro Tanaka
Date: Fri, 6 Jun 2014 19:08:08 +0900
Subject: [PATCH] dfa: build struct dfamust on demand
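The "build on demand" idea in this message can be illustrated with a small lazy-initialization sketch. It is not the code from the patch: `struct lazy_must`, `expensive_analysis` and `lazy_must_get` are hypothetical names, and the "analysis" is only a stand-in for whatever costly work a caller may never need.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for a result that is costly to compute.
   It does not mirror the real struct dfamust.  */
struct lazy_must
{
  bool built;        /* has the value been computed yet? */
  int build_count;   /* how many times the costly step ran */
  size_t value;      /* cached result */
};

/* Stand-in for the costly step (assumption: any pure function of the
   pattern would do for the purpose of this sketch).  */
static size_t
expensive_analysis (const char *pattern)
{
  return strlen (pattern);
}

/* Compute the value on first use only; later calls return the cache.
   A caller that never asks for the value never pays for it.  */
static size_t
lazy_must_get (struct lazy_must *m, const char *pattern)
{
  if (!m->built)
    {
      m->value = expensive_analysis (pattern);
      m->built = true;
      m->build_count++;
    }
  return m->value;
}
```

A caller would start from a zero-initialized `struct lazy_must` and call `lazy_must_get` only at the point where the result is actually consumed.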

bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated

2014-06-06 Thread Norihiro Tanaka
Arnold Robbins wrote: > Is strstr() even a good idea? dfa needs to be able to match NUL > bytes in the data. If this prevents that, then it's a problem. dfamust() doesn't change the result, so there is no problem. BTW, even before this change, dfamusts were terminated by a NUL byte. Thanks, Norihiro
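The concern about NUL bytes comes from the fact that strstr() treats both arguments as NUL-terminated C strings, so it cannot see past a NUL in the data. The sketch below contrasts it with a length-based search; `find_bytes` is a hypothetical, deliberately naive helper written for this illustration and is not taken from dfa.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: search HAY (HAY_LEN bytes) for NEEDLE
   (NEEDLE_LEN bytes) using explicit lengths, so NUL bytes are
   ordinary data.  Naive quadratic scan, for illustration only.  */
static const char *
find_bytes (const char *hay, size_t hay_len,
            const char *needle, size_t needle_len)
{
  if (needle_len == 0)
    return hay;
  if (hay_len < needle_len)
    return NULL;
  for (size_t i = 0; i + needle_len <= hay_len; i++)
    if (memcmp (hay + i, needle, needle_len) == 0)
      return hay + i;
  return NULL;
}
```

Given a buffer such as `"abc\0needle"`, strstr() stops at the embedded NUL and reports no match for `"needle"`, while the length-based search finds it at offset 4.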

bug#17715: [PATCH] dfa: build struct dfamust on demand

2014-06-06 Thread arnold
Norihiro Tanaka wrote: > If we don't use KWset, struct dfamust doesn't need to be built. This patch > changes the code so that it is built on demand. Gawk doesn't use KWset - does this patch affect gawk? Thanks, Arnold

bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated

2014-06-06 Thread Paul Eggert
Norihiro Tanaka wrote: > strstr() on CentOS 5.10 may be too old. Yes it is. But 'configure' is supposed to detect this. config.log should say something like this:
configure:26283: checking whether strstr works in linear time
configure:26357: gcc -std=gnu99 -o conftest -g -O2 conftest.c >&
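To give a feel for what a "strstr works in linear time" check has to exercise, here is a rough probe. It is a sketch under assumptions, not the test that 'configure' actually runs: `probe_strstr` and its input shape (a long run of 'A' ending in 'B') are chosen here only because that shape is slow for a naive quadratic search.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical probe: search for M 'A's followed by 'B' inside
   2*M 'A's followed by 'B'.  A quadratic strstr does roughly M*M
   comparisons here; a linear one does on the order of M.
   Returns 1 if the match is found at the expected offset, 0 if not,
   -1 if allocation fails.  CPU time is stored in *SECONDS.  */
static int
probe_strstr (size_t m, double *seconds)
{
  char *haystack = malloc (2 * m + 2);
  char *needle = malloc (m + 2);
  int ok = -1;

  if (haystack && needle)
    {
      memset (haystack, 'A', 2 * m);
      haystack[2 * m] = 'B';
      haystack[2 * m + 1] = '\0';
      memset (needle, 'A', m);
      needle[m] = 'B';
      needle[m + 1] = '\0';

      clock_t start = clock ();
      const char *hit = strstr (haystack, needle);
      *seconds = (double) (clock () - start) / CLOCKS_PER_SEC;

      /* The only match starts at offset M.  */
      ok = (hit == haystack + m);
    }
  free (haystack);
  free (needle);
  return ok;
}
```

Running the probe with increasing values of `m` and comparing the reported times gives a crude indication of whether the library's strstr() scales linearly on a given system.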

bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated

2014-06-06 Thread Paul Eggert
Norihiro Tanaka wrote: > BTW, even before this change, dfamusts were terminated by a NUL byte. Yes, that's right, this patch doesn't affect whether the DFA code works correctly on NUL bytes. There is a performance issue: if the pattern contains NUL bytes the DFA code doesn't operate as effici

bug#17715: [PATCH] dfa: build struct dfamust on demand

2014-06-06 Thread Norihiro Tanaka
Arnold Robbins wrote: > Gawk doesn't use KWset - does this patch affect gawk? Yes, but it should be especially welcome for Gawk. After this change dfamust() is no longer called in Gawk, because its output wasn't used anywhere even before the change. So a case where dfamust() is very slow, as in bug#177

bug#17700: [PATCH] dfa: speed-up for a pattern that many atoms are catenated

2014-06-06 Thread Norihiro Tanaka
Paul Eggert wrote: > Could you please investigate why it is not occurring on CentOS 5.10? > For example, why does the attached program work? It should fail. Sorry, I hadn't updated gnulib. After updating, it returned immediately. I have also confirmed the speed-up in grep. Thanks, Norihiro

bug#17722: Makefile rule fix and cleanup patches

2014-06-06 Thread Jim Meyering
I nearly omitted the second, since using scripts for egrep and fgrep may be removed, but left it in on principle: set a good example.
Attachments: 0001-build-don-t-redirect-directly-to.patch, 0002-build-improve-rule-to-generate-egrep-fgrep-scripts.patch

bug#17722: Makefile rule fix and cleanup patches

2014-06-06 Thread Paul Eggert
Jim Meyering wrote: > using scripts for egrep and fgrep may be removed Let's not remove the scripts, as they're better on platforms where they're supported. Users of a script can more-easily understand and modify what it does, which is a better match for the GNU project's overarching goals.

bug#17722: Makefile rule fix and cleanup patches

2014-06-06 Thread Jim Meyering
On Fri, Jun 6, 2014 at 5:16 PM, Paul Eggert wrote: > Jim Meyering wrote: >> >> using scripts for egrep and fgrep may be removed > > > Let's not remove the scripts, as they're better on platforms where they're > supported. Users of a script can more-easily understand and modify what it > does, whi

bug#17722: Makefile rule fix and cleanup patches

2014-06-06 Thread Paul Eggert
Jim Meyering wrote: > why wouldn't we want to keep the build rules simple and the same for everyone? We wouldn't want to do that if it entailed bloated and opaque executables on all platforms, or if it entailed too-complicated C programs on all platforms. We should be able to avoid both probl