Re: [PATCH 0/5] speed up gnulib-tool some more

Ralf Wildenhues Mon, 18 Jan 2010 13:44:02 -0800

Hello Bruno,

* Bruno Haible wrote on Mon, Jan 18, 2010 at 01:41:57AM CET:
> In particular, patch 1 and 2 each removed many blank lines from the output.
> One needs to verify whether this is harmless.


I did verify that for a number of test cases (one module, several
modules, and all modules, with different module selections).  More
precisely, I verified that the test dirs show no differences when
compared with
  diff -ruB -x autom4te.cache old new
(-B makes diff ignore changes that only insert or delete blank lines).

> Regarding part 1, I would like to put in a command-line option --no-cache
> that preserves the old, slow but simpler code. It is good practice, for
> every nontrivial optimization, to have a way to disable the optimization.
> This helps in two situations:
>   - When a bug crops up, and you want to know whether the cache introduced it.

This is no different from when a bug crops up in any other code: we
debug it, by bisecting the git history or by inspecting the code, adding
printf statements or using sh -x or the bash debugger.

Having two code paths for the same functionality in practice often means
that the lesser-used one will bit-rot over time.  If you are not
confident enough in the new code, then please point out which parts you
think are problematic.  I would rather introduce (automated) testing for
gnulib-tool itself, to ensure that it works as desired, than add dead
code.

But that said, I wrote a patch that adds back the old non-caching code
paths.

Another issue I thought of was cache variable name collision.  I'm
posting a patch to error out upon such a collision (it can easily be
avoided by changing the name -> cachevar mapping, should the need
arise).
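
To illustrate the kind of collision meant here, a minimal sketch follows.
It is not the code from the patch: the function names cachevar_for and
register_module, the cache_ prefix and the _owner suffix are all
hypothetical, and the mapping (replace every character outside
[A-Za-z0-9_] with an underscore) is only an assumed example of a
name -> cachevar mapping.

```shell
# Hypothetical sketch; names are illustrative, not gnulib-tool's own.
# Map a module name to a string usable as a shell variable name by
# replacing every character outside [A-Za-z0-9_] with '_'.
cachevar_for () {
  printf 'cache_%s\n' "$(printf '%s' "$1" | sed 's/[^A-Za-z0-9_]/_/g')"
}

# Record which module owns each cache variable.  Return 1 if a
# different module already maps to the same variable.
register_module () {
  cv=$(cachevar_for "$1")
  eval "owner=\${${cv}_owner-}"
  if test -n "$owner" && test "$owner" != "$1"; then
    echo "cache variable collision: '$1' and '$owner' both map to $cv" >&2
    return 1
  fi
  eval "${cv}_owner=\$1"
}
```

With this mapping, registering foo-bar and then foo_bar trips the check,
since both become cache_foo_bar; a mapping that encodes '-' differently
from '_' would avoid that.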

>   - When in the future, the optimization turns out to be a blocker for new
>     developments that were not known at this time.

We can deal with unknown future problems when we get to them; thanks to
version control we can easily see what needs to be reverted, should any
changes, not just optimizations, turn out to be wrong turns.

This patch series isn't premature optimization: the gnulib-tool code has
been mostly stable for quite some time already, and I do profile and
test what I'm optimizing.

> Regarding part 1 also, I would like to investigate the possible speedup that
> use of the new associative arrays of bash 4 can bring. See
>   <http://www.gnu.org/software/bash/manual/html_node/Arrays.html>
>   <http://tldp.org/LDP/abs/html/bashver4.html>

Sounds like a good idea to me, if you ensure that older shells won't
parse-error out on such constructs.  I don't expect much improvement
over bash 3, however: the only difference wrt. my patch series would be
somewhat less eval'ed code (and bash's parsing is slow, but not *that*
slow).  The big benefit in this series comes from an algorithmic
improvement.
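
One way to keep other shells from choking on the bash-only syntax is to
hide it inside eval strings, so that it is only parsed once bash 4 has
been detected.  The following is a rough sketch under that assumption,
not code from gnulib-tool; cache, cache_key, cache_set and cache_get are
hypothetical names.

```shell
# Hypothetical sketch: use a bash 4 associative array when available,
# else fall back to eval'ed variable names.  The bash-only syntax sits
# inside single-quoted eval strings, so other shells never parse it.
if test -n "${BASH_VERSION-}" && eval 'declare -A cache' 2>/dev/null; then
  cache_set () { eval 'cache[$1]=$2'; }
  cache_get () { eval 'printf "%s\n" "${cache[$1]-}"'; }
else
  # Fallback: turn the key into a valid variable name suffix.
  cache_key () { printf '%s' "$1" | sed 's/[^A-Za-z0-9_]/_/g'; }
  cache_set () { eval "cache_$(cache_key "$1")=\$2"; }
  cache_get () { eval "printf '%s\n' \"\${cache_$(cache_key "$1")-}\""; }
fi

cache_set foo-bar 'lib/foo.c lib/bar.c'
cache_get foo-bar
```

Both branches still go through eval in this sketch, so it says nothing
about the achievable speedup; it only shows that the two variants can
coexist in one script.  Note that the fallback maps foo-bar and foo_bar
to the same variable, whereas the associative array keeps them apart.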

> > IRIX 6.5 sed has too many problems for 
> > gnulib-tool both without and with this patch series, but using GNU sed
> > there seems to work fine.
> 
> Can you put this info into the doc somewhere?

I'll write a patch.

Cheers,
Ralf
