Hi, Kevin Ryde <[EMAIL PROTECTED]> writes:
> If that's true you'll have to start from the beginning again,
> everyone's eyes glaze over at "1000 modules".

Quoting the original message in this thread:

  An application of mine [1], although it modifies `the-scm-module' at
  run-time, requiring 40 modules to re-process duplicates, has its
  execution time reduced by 8% (on a run that loads around 100 modules).
  The whole test suite runs about 10% faster with the modified version
  (although it has a larger `modules.test').  So it seems to be
  beneficial performance-wise.

  I'd be happy if people could try it out with other applications
  (e.g., Lilypond ;-)) and measure the difference it makes.

Note that both measurements are of whole-application execution time
(neither application is long-running, though).  So 8%-10% is pretty good
given that the changes are expected to be beneficial only at
initialization time, i.e., before all duplicates have been processed and
all variables have been memoized.

So, once again, if people could make similar measurements with other
applications, that'd be really great.  :-)

>> From a performance viewpoint, the question is: how can we
>> provide the fastest variable lookup and duplicate binding detection,
>> i.e., using what data structures and algorithms?
>
> There's no need to think too hard about the data until the innermost
> code dealing with them is in C.

If I understood correctly, I think I disagree.  Writing in C or assembly
does not free one from choosing appropriate algorithms.  An algorithm
that does not scale remains unscalable, whether it is implemented in C
or not.

> I guess the general problem is sets of names to be combined and looked
> up with certain precedence.  I can't see anything much to help that
> without using quite a bit of extra memory (which of course is bad for
> GC, and bad for the system generally because it's unshared).

Yes, there may be a space-time tradeoff.  Clearly, my patch requires
more (heap-allocated) memory.
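To make the problem concrete, here is a minimal, language-neutral sketch
(in Python, purely for illustration -- this is *not* Guile's actual
implementation, and the module names and bindings are made up): a set of
imported modules whose exported names are scanned eagerly for clashes,
and a lookup that resolves a name by import precedence.

```python
def process_duplicates(import_list):
    """Eagerly scan all imports and collect names bound by more than one
    module -- roughly the job of a `process-duplicates' pass.  Cost is
    O(total number of exported names), paid on every (re-)import."""
    seen = {}        # name -> first module binding it
    duplicates = {}  # name -> list of modules binding it
    for module_name, bindings in import_list:
        for name in bindings:
            if name in seen and seen[name] != module_name:
                duplicates.setdefault(name, [seen[name]]).append(module_name)
            else:
                seen[name] = module_name
    return duplicates

def lookup(import_list, name):
    """Resolve NAME against the imports in precedence order
    (earlier modules shadow later ones)."""
    for module_name, bindings in import_list:
        if name in bindings:
            return bindings[name]
    raise NameError(name)

# Hypothetical modules; values stand in for variable objects.
imports = [("module-a", {"open-pipe": "<a:open-pipe>"}),
           ("module-b", {"open-pipe": "<b:open-pipe>",
                         "fold":      "<b:fold>"})]

print(process_duplicates(imports))   # {'open-pipe': ['module-a', 'module-b']}
print(lookup(imports, "open-pipe"))  # '<a:open-pipe>' -- first import wins
```

The point of the sketch is only to show where the costs sit: the eager
scan is what re-runs when the import set changes, while per-lookup cost
depends on how the combined name sets are represented.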
> One possibility for duplicates would be lazy checking, only check for
> a clash when actually using a symbol.  That's sort of the Prolog
> theory: don't worry now about what might never come up.  I suspect the
> total work would end up greater though.

So we'd perform duplicate checking as variables are looked up?
Interesting idea.  Then we'd completely remove `process-duplicates' and
would have an O(USES) variable lookup.  However, it may be quite
R6RS-unfriendly, since R6RS requires duplicates to be detected as
errors, i.e., before code is actually run (Section 6.1 of R5.92RS).
Anyway, maybe we should give it a try and compare both approaches?

Thanks,
Ludo'.

_______________________________________________
Guile-devel mailing list
Guile-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/guile-devel