-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

[adding m4-patches; this branch of the thread can drop other lists]

According to Eric Blake on 9/29/2007 1:31 PM:
> Here's something a bit more telling.  With the attached patch, and in the
> coreutils directory,
>
> $ M4_TRACE_FILE=~/m4.trace M4=~/m4/src/m4 autoconf

I tweaked my tracer patch a bit to distinguish between patsubst and regexp.

$ sort <m4.trace | uniq -c |sort -n -k1,1 |tail -n 15
...
   1214 p:
   1596
...

So half of the empty lines in my trace actually did a multi-line regex.  But
1214 of them did a patsubst(string, [], []), and m4 wasted time compiling the
empty regex every one of those times!

Applying this to m4, to add some benefit to autoconf < 2.62 vs. m4 > 1.4.10
(watch for a followup to autoconf that avoids the empty regex to begin with).

2007-09-29  Eric Blake  <[EMAIL PROTECTED]>

	Optimize for Autoconf usage pattern.
	* src/builtin.c (m4_regexp, m4_patsubst): Handle empty regex
	faster.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             [EMAIL PROTECTED]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/udP84KuGfSFAYARAmHzAJwJO8+zwXssS/qlIEfotONpp/epRgCfQgQ3
Rjq/NWvO4ha9S+o3gpv9gdg=
=BSOA
-----END PGP SIGNATURE-----
From aa46ced67010190918295b965f5e2879dcd9a30c Mon Sep 17 00:00:00 2001
From: Eric Blake <[EMAIL PROTECTED]>
Date: Sat, 29 Sep 2007 17:48:29 -0600
Subject: [PATCH] Optimize for Autoconf usage pattern.

* src/builtin.c (m4_regexp, m4_patsubst): Handle empty regex
faster.

Signed-off-by: Eric Blake <[EMAIL PROTECTED]>
---
 ChangeLog     |    6 ++++++
 src/builtin.c |   36 +++++++++++++++++++++++++++---------
 2 files changed, 33 insertions(+), 9 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 0cea5b5..f29b557 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,9 @@
+2007-09-29  Eric Blake  <[EMAIL PROTECTED]>
+
+	Optimize for Autoconf usage pattern.
+	* src/builtin.c (m4_regexp, m4_patsubst): Handle empty regex
+	faster.
+
 2007-09-24  Eric Blake  <[EMAIL PROTECTED]>
 
 	Create .gitignore alongside .cvsignore.
diff --git a/src/builtin.c b/src/builtin.c
index dee2276..65f4585 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -1968,8 +1968,19 @@ m4_regexp (struct obstack *obs, int argc, token_data **argv)
       return;
     }
 
-  victim = TOKEN_DATA_TEXT (argv[1]);
-  regexp = TOKEN_DATA_TEXT (argv[2]);
+  victim = ARG (1);
+  regexp = ARG (2);
+  repl = ARG (3);
+
+  if (!*regexp)
+    {
+      /* The empty regex matches everything!  */
+      if (argc == 3)
+        shipout_int (obs, 0);
+      else
+        obstack_grow (obs, repl, strlen (repl));
+      return;
+    }
 
   init_pattern_buffer (&buf, &regs);
   msg = re_compile_pattern (regexp, strlen (regexp), &buf);
@@ -1993,10 +2004,7 @@ m4_regexp (struct obstack *obs, int argc, token_data **argv)
   else if (argc == 3)
     shipout_int (obs, startpos);
   else if (startpos >= 0)
-    {
-      repl = TOKEN_DATA_TEXT (argv[3]);
-      substitute (obs, victim, repl, &regs);
-    }
+    substitute (obs, victim, repl, &regs);
 
   free_pattern_buffer (&buf, &regs);
 }
@@ -2013,6 +2021,7 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
 {
   const char *victim;		/* first argument */
   const char *regexp;		/* regular expression */
+  const char *repl;
 
   struct re_pattern_buffer buf;	/* compiled regular expression */
   struct re_registers regs;	/* for subexpression matches */
@@ -2029,7 +2038,17 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
       return;
     }
 
-  regexp = TOKEN_DATA_TEXT (argv[2]);
+  victim = ARG (1);
+  regexp = ARG (2);
+  repl = ARG (3);
+
+  /* The empty regex matches everywhere, but if there is no
+     replacement, we need not waste time with it.  */
+  if (!*regexp && !*repl)
+    {
+      obstack_grow (obs, victim, strlen (victim));
+      return;
+    }
 
   init_pattern_buffer (&buf, &regs);
   msg = re_compile_pattern (regexp, strlen (regexp), &buf);
@@ -2042,7 +2061,6 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
       return;
     }
 
-  victim = TOKEN_DATA_TEXT (argv[1]);
   length = strlen (victim);
 
   offset = 0;
@@ -2073,7 +2091,7 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
 
       /* Handle the part of the string that was covered by the match.  */
 
-      substitute (obs, victim, ARG (3), &regs);
+      substitute (obs, victim, repl, &regs);
 
       /* Update the offset to the end of the match.  If the regexp
          matched a null string, advance offset one more, to avoid
-- 
1.5.3.2
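For anyone who wants to see the shape of the fast path outside of m4, here is a minimal standalone sketch.  It is not the m4 code: first_match and replace_all are hypothetical helper names, it uses POSIX regcomp/regexec (basic syntax) instead of the GNU re_compile_pattern interface that builtin.c uses, and the replacement is copied literally with no \N back-references.  The only point is that both helpers test for the empty regex before paying for a compile.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of m4.  Return the offset of the first
   match of REGEXP in VICTIM, -1 if there is no match, or -2 if REGEXP
   does not compile.  */
int
first_match (const char *victim, const char *regexp)
{
  regex_t buf;
  regmatch_t m;
  int result;

  assert (victim && regexp);

  /* Fast path: the empty regex matches at offset 0 of any string, so
     there is no need to compile anything.  */
  if (!*regexp)
    return 0;

  if (regcomp (&buf, regexp, 0) != 0)
    return -2;
  result = regexec (&buf, victim, 1, &m, 0) == 0 ? (int) m.rm_so : -1;
  regfree (&buf);
  return result;
}

/* Hypothetical helper, not part of m4.  Return a malloc'd copy of
   VICTIM with every match of REGEXP replaced by the literal string
   REPL, or NULL on allocation or compile failure.  */
char *
replace_all (const char *victim, const char *regexp, const char *repl)
{
  regex_t buf;
  regmatch_t m;
  size_t vlen = strlen (victim);
  size_t rlen = strlen (repl);
  size_t offset = 0;
  size_t used = 0;
  char *out;

  /* At most vlen + 1 matches, each adding rlen bytes, plus at most
     vlen copied bytes and the trailing NUL.  */
  out = malloc ((vlen + 1) * (rlen + 1));
  if (!out)
    return NULL;

  /* Fast path: an empty regex with an empty replacement leaves the
     string unchanged, so skip the compile and the search loop.  */
  if (!*regexp && !*repl)
    {
      memcpy (out, victim, vlen + 1);
      return out;
    }

  if (regcomp (&buf, regexp, 0) != 0)
    {
      free (out);
      return NULL;
    }

  while (offset <= vlen
         && regexec (&buf, victim + offset, 1, &m,
                     offset ? REG_NOTBOL : 0) == 0)
    {
      size_t so = (size_t) m.rm_so;
      size_t eo = (size_t) m.rm_eo;

      /* Copy the text before the match, then the replacement.  */
      memcpy (out + used, victim + offset, so);
      used += so;
      memcpy (out + used, repl, rlen);
      used += rlen;

      if (eo == so)
        {
          /* Empty match: copy one byte and step past it, to avoid
             looping forever at the same offset.  */
          if (offset + eo < vlen)
            out[used++] = victim[offset + eo];
          offset += eo + 1;
        }
      else
        offset += eo;
    }
  regfree (&buf);

  /* Copy whatever follows the last match.  */
  if (offset < vlen)
    {
      memcpy (out + used, victim + offset, vlen - offset);
      used += vlen - offset;
    }
  out[used] = '\0';
  return out;
}
```

With these helpers, first_match ("hello", "") and replace_all ("hello", "", "") both return without ever calling regcomp, which is the case the trace above showed autoconf hitting 1214 times.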