Hello Bruno,

* Bruno Haible wrote on Mon, Jan 18, 2010 at 01:41:57AM CET:
> In particular, patch 1 and 2 each removed many blank lines from the output.
> One needs to verify whether this is harmless.

I did verify that for a number of test cases (one module, several
modules, all modules, for different modules). More precisely, I
verified that the test dirs show no differences when compared with

  diff -ruB -x autom4te.cache old new

> Regarding part 1, I would like to put in a command-line option --no-cache
> that preserves the old, slow but simpler code. It is good practice, for
> every nontrivial optimization, to have a way to disable the optimization.
> This helps in two situations:
>   - When a bug crops up, and you want to know whether the cache introduced it.

This is no different from when a bug crops up in any other code:
we debug it, by bisecting the git history or by inspecting the code,
adding printf statements or using sh -x or the bash debugger.

Having two code paths for the same functionality in practice often
means that the lesser-used one will bit-rot over time. If you are not
confident enough in the new code, then please point out which parts
you think are problematic. I would prefer introducing (automated)
testing for gnulib-tool itself, to ensure that it works as desired,
over adding dead code.

But that said, I wrote a patch that adds back the old non-caching
code paths.

Another issue I thought of was cache variable name collision. I'm
posting a patch to error out upon such a collision (it can easily be
avoided by changing the name -> cachevar mapping, should the need
arise).

>   - When in the future, the optimization turns out to be a blocker for new
>     developments that were not known at this time.

We can deal with unknown future problems when we get to them; thanks
to version control we can easily see what needs to be reverted, should
any changes, not just optimizations, turn out to be wrong turns.

This patch series isn't premature optimization: the gnulib-tool code
has been mostly stable for quite some time already, and it's not like
I wouldn't profile and test what I'm optimizing.
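
To illustrate the collision issue mentioned above, here is a minimal
sketch of a name -> cachevar mapping with a collision check. It is not
the code from the patch: the function names (func_cachevar,
func_cache_store, func_cache_lookup), the cache_ prefix and the
mapping rule (replace every character outside [A-Za-z0-9_] with '_')
are all assumptions made for this example.

```shell
# Hypothetical sketch; names and mapping rule are illustrative only,
# not taken from gnulib-tool.

# Map a module name to a shell-safe variable name by replacing every
# character outside [A-Za-z0-9_] with '_'.
func_cachevar ()
{
  printf 'cache_%s\n' "$1" | sed -e 's/[^A-Za-z0-9_]/_/g'
}

# func_cache_store NAME VALUE
# Remember VALUE for NAME.  Returns 1 if a different NAME already
# occupies the same cache variable.
func_cache_store ()
{
  cs_var=`func_cachevar "$1"`
  eval "cs_owner=\${${cs_var}_name-}"
  if test -n "$cs_owner" && test "$cs_owner" != "$1"; then
    echo "cache variable collision: $1 and $cs_owner both map to $cs_var" >&2
    return 1
  fi
  eval "${cs_var}_name=\$1; ${cs_var}_value=\$2"
}

# func_cache_lookup NAME
# Print the cached value for NAME, or return 1 if nothing is cached.
func_cache_lookup ()
{
  cl_var=`func_cachevar "$1"`
  eval "cl_owner=\${${cl_var}_name-}"
  test "$cl_owner" = "$1" || return 1
  eval "printf '%s\n' \"\${${cl_var}_value}\""
}
```

With this mapping, c-ctype and c_ctype both land on cache_c_ctype, so
storing the second one after the first returns failure instead of
silently overwriting the entry. A caller that wants to "error out"
would exit on that failure.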

> Regarding part 1 also, I would like to investigate the possible speedup that
> use of the new associative arrays of bash 4 can bring. See
> <http://www.gnu.org/software/bash/manual/html_node/Arrays.html>
> <http://tldp.org/LDP/abs/html/bashver4.html>

Sounds like a good idea to me, if you ensure that older shells won't
parse-error out on such constructs. I don't expect that much
improvement over bash3, however; the only difference wrt. my patch
series would be somewhat less eval'ed code (and bash's parsing is slow
but not *that* slow). The big benefit in this series comes from an
algorithmic improvement.

> > IRIX 6.5 sed has too many problems for
> > gnulib-tool both without and with this patch series, but using GNU sed
> > there seems to work fine.
>
> Can you put this info into the doc somewhere?

I'll write a patch.

Cheers,
Ralf
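
On the bash 4 associative array point above: one way to keep older
shells from parse-erroring is to hide the array syntax inside an
eval'ed string that is only evaluated once bash >= 4 has been detected.
The following is a minimal sketch under that assumption; the names
have_assoc, modcache, func_assoc_set and func_assoc_get are
hypothetical and not part of gnulib-tool.

```shell
# Hypothetical sketch; all names are illustrative only.

# Detect bash >= 4.  The array reference is inside eval so that shells
# without array support never have to parse it.
if test -n "${BASH_VERSION-}" \
   && eval 'test "${BASH_VERSINFO[0]}" -ge 4' 2>/dev/null; then
  have_assoc=yes
else
  have_assoc=no
fi

if test "$have_assoc" = yes; then
  # bash 4 path: a real associative array, again hidden inside eval.
  eval 'declare -A modcache
  func_assoc_set () { modcache[$1]="$2"; }
  func_assoc_get ()
  {
    test "${modcache[$1]+set}" = set || return 1
    printf "%s\n" "${modcache[$1]}"
  }'
else
  # Portable path: one eval-ed variable per key.  Keys that differ only
  # in characters outside [A-Za-z0-9_] collide under this mapping.
  func_assoc_set ()
  {
    as_var=assoc_`printf '%s\n' "$1" | sed -e 's/[^A-Za-z0-9_]/_/g'`
    eval "$as_var=\$2"
  }
  func_assoc_get ()
  {
    ag_var=assoc_`printf '%s\n' "$1" | sed -e 's/[^A-Za-z0-9_]/_/g'`
    eval "test \"\${$ag_var+set}\" = set" || return 1
    eval "printf '%s\n' \"\${$ag_var}\""
  }
fi
```

Both branches expose the same two functions, so callers do not need to
know which one is active. The bash 4 branch saves the sed call and one
eval per access, which matches the expectation that the gain is in
less eval'ed code rather than in the algorithm.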