Re: M4 1.4.10b [beta] released

Bruno Haible Thu, 28 Feb 2008 06:02:36 -0800

Hi Eric,

> > | Effects on "autoconf" in gettext/gettext-tools:
> > |
> > | m4-1.4.10      50.6 s real + 1.0 sec system
> > | m4-1.4.10b     42.1 s real + 0.9 sec system
> 
> And with --disable-assert, I get:
> 
>                    41.2 s real + 0.9 sec system


With the patch below, I get:

                     36.9 s real + 0.9 sec system

i.e. an additional 10% speedup over the --disable-assert build (41.2 s
down to 36.9 s).

Only the INPUT_CHAIN case is speed-relevant. I added the INPUT_STRING
case because it may be useful in situations I don't know of.
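
For readers who want the idea without the m4 internals, here is a minimal
standalone sketch of the scan that both cases perform. It assumes
single-character quotes, as the patch does. The function name scan_quoted
and its parameters are hypothetical and do not exist in src/input.c; the
real code works on isp and the chain directly and copies into an obstack.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of m4.  Scan BUF (LEN bytes) for the
   close quote RQ that brings *LEVEL down to 0, counting nested open
   quotes LQ on the way.  Returns the number of bytes that precede the
   stopping point, i.e. the bytes a caller would copy in one go.
   *CLOSED is set to 1 if the scan stopped on the matching close
   quote, and to 0 if the buffer ran out first.  */
static size_t
scan_quoted (const char *buf, size_t len, unsigned char lq,
             unsigned char rq, int *level, int *closed)
{
  const char *p = buf;
  size_t n = len;

  *closed = 0;
  while (n > 0)
    {
      unsigned char ch = (unsigned char) *p;
      if (ch == rq)
        {
          /* A close quote: leave one nesting level.  */
          if (--*level == 0)
            {
              *closed = 1;
              break;
            }
        }
      else
        /* An open quote enters one nesting level; anything else
           leaves the level unchanged.  */
        *level += (ch == lq);
      p++;
      n--;
    }
  return (size_t) (p - buf);
}
```

Called with level = 1 on the text a `b' c' rest (that is, just after an
opening quote has been consumed), this returns 7 and sets *closed to 1: the
seven bytes a `b' c can be copied with a single memory copy, and the caller
then skips 7 + 1 bytes of input. If the buffer ends before the level reaches
0, *closed stays 0 and the caller keeps the level for the next buffer.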


2008-02-28  Bruno Haible  <[EMAIL PROTECTED]>

        Optimize a code path frequently exercised by autoconf.
        Memory impact: none.
        Speed impact: 10% speedup.
        * src/input.c (next_token): Add an optimized code path for the
        frequent case of rescanning chains or strings.

*** src/input.c.bak     2008-02-23 18:13:02.000000000 +0100
--- src/input.c 2008-02-28 14:52:02.000000000 +0100
***************
*** 1632,1637 ****
--- 1632,1741 ----
        quote_level = 1;
        while (1)
        {
+         if (curr_quote.len1 == 1 && curr_quote.len2 == 1 && !input_change
+             && isp)
+           {
+             /* The optimized case.  It heavily inlines the MATCH macro and
+                the next_char and next_char_1 functions, to the point that
+                the scan is a loop over a region of memory followed by a
+                simple memory copy operation.
+                The case with INPUT_CHAIN alone can speed up GNU autoconf
+                runs by 10%.  */
+             if (isp->type == INPUT_CHAIN)
+               {
+                 token_chain *chain = isp->u.u_c.chain;
+ 
+                 if (chain)
+                   {
+                     if (obs != NULL && current_quote_age
+                         && chain->quote_age == current_quote_age)
+                       {
+                         /* next_char () would return CHAR_QUOTE here.  */
+                         append_quote_token (obs, td);
+                       }
+                     else if (chain->type == CHAIN_STR && chain->u.u_s.len > 0)
+                       {
+                         unsigned char curr_quote_1 =
+                           to_uchar (curr_quote.str1[0]);
+                         unsigned char curr_quote_2 =
+                           to_uchar (curr_quote.str2[0]);
+                         const char *p;
+                         size_t n;
+                         size_t count;
+ 
+                         /* Partial consumption invalidates quote age.  */
+                         chain->quote_age = 0;
+ 
+                         p = chain->u.u_s.str;
+                         n = chain->u.u_s.len;
+                         do
+                           {
+                             unsigned char ch = to_uchar (*p);
+                             if (ch == curr_quote_2)
+                               {
+                                 if (--quote_level == 0)
+                                   break;
+                               }
+                             else
+                               quote_level += (ch == curr_quote_1);
+                             p++;
+                             n--;
+                           }
+                         while (n > 0);
+ 
+                         count = p - chain->u.u_s.str;
+                         if (count > 0)
+                           obstack_grow (obs_td, chain->u.u_s.str, count);
+                         count += (quote_level == 0);
+                         chain->u.u_s.str += count;
+                         chain->u.u_s.len -= count;
+                         if (quote_level == 0)
+                           break;
+                         continue;
+                       }
+                   }
+               }
+             else if (isp->type == INPUT_STRING)
+               {
+                 if (isp->u.u_s.len > 0)
+                   {
+                     unsigned char curr_quote_1 =
+                       to_uchar (curr_quote.str1[0]);
+                     unsigned char curr_quote_2 =
+                       to_uchar (curr_quote.str2[0]);
+                     const char *p = isp->u.u_s.str;
+                     size_t n = isp->u.u_s.len;
+                     size_t count;
+ 
+                     do
+                       {
+                         unsigned char ch = to_uchar (*p);
+                         if (ch == curr_quote_2)
+                           {
+                             if (--quote_level == 0)
+                               break;
+                           }
+                         else
+                           quote_level += (ch == curr_quote_1);
+                         p++;
+                         n--;
+                       }
+                     while (n > 0);
+ 
+                     count = p - isp->u.u_s.str;
+                     if (count > 0)
+                       obstack_grow (obs_td, isp->u.u_s.str, count);
+                     count += (quote_level == 0);
+                     isp->u.u_s.str += count;
+                     isp->u.u_s.len -= count;
+                     if (quote_level == 0)
+                       break;
+                     continue;
+                   }
+               }
+           }
+ 
+         /* The general case.  Proceed character by character.  */
          ch = next_char (obs != NULL && current_quote_age);
          if (ch == CHAR_EOF)
            /* Current_file changed to "" if we see CHAR_EOF, use



_______________________________________________
M4-discuss mailing list
M4-discuss@gnu.org
http://lists.gnu.org/mailman/listinfo/m4-discuss
