-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 9/29/2007 1:04 PM:
> You've given me a good idea - I'll try instrumenting a version of m4 and
> coming up with a good list of the most popular regex patterns in use by
> autoconf (autoconf -t has limitations, since regex patterns tend to mess
> up the quoting of the trace file).

Here's something a bit more telling.  With the attached patch, and in the
coreutils directory,

$ M4_TRACE_FILE=~/m4.trace M4=~/m4/src/m4 autoconf
$ wc m4.trace
  62207   61314  666720 m4.trace
$ sort -u m4.trace | wc
    401     405    5619
$ sort <m4.trace | uniq -c |sort -n -k1,1 |tail -n 15
    740 [\\'']
    816 (.*)
    863 ^[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_][abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_0123456789]*$
   1163 \(..\)$
   1163 \\
   1163 ^\(..\)
   2244 [^a-zA-Z0-9_]
   2324 [ ]+
   3242
   4306 \\[`""]
   5020 [`""]
   5020 \\[\\$]
   7376 [`$]
  11504 @\(<:\|:>\|S|\|%:\)@
  11915 @&t@

Wow.  61 thousand compilations of a regular expression pattern, with only
405 unique patterns.  Sounds like some definite speedups to m4 are possible
if we were to cache compiled regular expressions and reuse them, rather
than always compiling from scratch.  Also, some of those most frequent
patterns can be done with index() rather than regexp() in m4sugar, offering
some speedups even without m4 improvements.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             [EMAIL PROTECTED]
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/qga84KuGfSFAYARAkTBAKDBUav7ddmiYKminHdxjc9Mc700JACgkaM+
wspVndSXvYUDNNmyXURQq54=
=Bq+8
-----END PGP SIGNATURE-----
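
The caching idea mentioned above could look roughly like the following.
This is only a minimal sketch under stated assumptions, not m4's code: it
uses POSIX regcomp() in place of the GNU re_compile_pattern() that m4
calls, and the names regex_cache_lookup, REGEX_CACHE_SIZE and the hit/miss
counters are hypothetical.  With only a few hundred unique patterns seen in
the trace, a small table searched linearly by pattern text may be enough to
avoid most recompilations; a hash table would be the obvious refinement.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache size; nothing here exists in m4 itself.  */
#define REGEX_CACHE_SIZE 64

struct regex_cache_entry
{
  char *pattern;     /* owned copy of the pattern text */
  regex_t compiled;  /* result of compiling PATTERN once */
};

static struct regex_cache_entry regex_cache[REGEX_CACHE_SIZE];
static size_t regex_cache_used;
static unsigned long regex_cache_hits;
static unsigned long regex_cache_misses;

/* Return a compiled regex for PATTERN, compiling it only the first
   time it is seen.  Returns NULL if compilation or allocation fails.
   The returned pointer stays valid until the entry is evicted.  */
static const regex_t *
regex_cache_lookup (const char *pattern)
{
  struct regex_cache_entry *e;
  size_t i;

  assert (pattern != NULL);

  /* Linear search by pattern text.  */
  for (i = 0; i < regex_cache_used; i++)
    if (strcmp (regex_cache[i].pattern, pattern) == 0)
      {
        regex_cache_hits++;
        return &regex_cache[i].compiled;
      }

  regex_cache_misses++;

  /* Table full: drop the most recently added entry to make room.
     This is a deliberately crude eviction policy.  */
  if (regex_cache_used == REGEX_CACHE_SIZE)
    {
      e = &regex_cache[--regex_cache_used];
      regfree (&e->compiled);
      free (e->pattern);
      e->pattern = NULL;
    }

  e = &regex_cache[regex_cache_used];
  e->pattern = malloc (strlen (pattern) + 1);
  if (e->pattern == NULL)
    return NULL;
  strcpy (e->pattern, pattern);

  /* Flags 0 selects POSIX basic regular expressions.  */
  if (regcomp (&e->compiled, pattern, 0) != 0)
    {
      free (e->pattern);
      e->pattern = NULL;
      return NULL;
    }

  regex_cache_used++;
  return &e->compiled;
}
```

A caller would pass the result straight to regexec(); looking up the same
pattern text twice returns the same compiled object and bumps the hit
counter instead of compiling again.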
diff --git a/src/builtin.c b/src/builtin.c
index dee2276..fa141b0 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -1922,6 +1922,8 @@ Warning: \\0 will disappear, use \\& instead in replacements"));
     }
 }
 
+extern FILE *trace_file;
+
 /*------------------------------------------.
 | Initialize regular expression variables.  |
 `------------------------------------------*/
@@ -1973,6 +1975,8 @@ m4_regexp (struct obstack *obs, int argc, token_data **argv)
 
   init_pattern_buffer (&buf, &regs);
   msg = re_compile_pattern (regexp, strlen (regexp), &buf);
+  if (trace_file)
+    fprintf (trace_file, "%s\n", regexp);
 
   if (msg != NULL)
     {
@@ -2033,6 +2037,8 @@ m4_patsubst (struct obstack *obs, int argc, token_data **argv)
 
   init_pattern_buffer (&buf, &regs);
   msg = re_compile_pattern (regexp, strlen (regexp), &buf);
+  if (trace_file)
+    fprintf (trace_file, "%s\n", regexp);
 
   if (msg != NULL)
     {
diff --git a/src/m4.c b/src/m4.c
index 2d5ced0..9cce4fc 100644
--- a/src/m4.c
+++ b/src/m4.c
@@ -318,6 +318,8 @@ process_file (const char *name)
 #define OPTSTRING "-B:D:EF:GH:I:L:N:PQR:S:T:U:d::eil:o:st:"
 #endif
 
+FILE *trace_file;
+
 int
 main (int argc, char *const *argv, char *const *envp)
 {
@@ -338,6 +340,12 @@ main (int argc, char *const *argv, char *const *envp)
   retcode = EXIT_SUCCESS;
   atexit (close_stdin);
 
+  {
+    const char *name = getenv ("M4_TRACE_FILE");
+    if (name)
+      trace_file = fopen(name, "a");
+  }
+
   include_init ();
   debug_init ();
 #ifdef USE_STACKOVF
@@ -590,6 +598,8 @@ main (int argc, char *const *argv, char *const *envp)
       undivert_all ();
     }
   output_exit ();
+  if (trace_file)
+    fclose (trace_file);
   free_macro_sequence ();
   exit (retcode);
 }
_______________________________________________
M4-discuss mailing list
M4-discuss@gnu.org
http://lists.gnu.org/mailman/listinfo/m4-discuss