Hello,

An update on this…

l...@gnu.org (Ludovic Courtès) skribis:

> This is a followup to the compiler issue, this time focusing on
> execution time, trying to see why the compiler takes so long to compile
> gnu/packages/*.scm, which should be trivial.
>
> As a test bed, I built this from the Guix tree:
>
>   GUILE_LOAD_COMPILED_PATH=$PWD guild compile \
>     -L $PWD -o t.cps --to=cps gnu/packages/python.scm
>
> This results in a 428,185-line file, with that many CPS labels.

[...]

> The intmap ‘big-cps’ has those 428K elements distributed in 13,814
> 33-element vectors (per ‘intmap-vector-count’ in the attached file.)
>
> I fail to see why we’re GC’ing so much (‘visit-branch’ is also first in
> the GC profile).

To get a clearer idea of where memory consumption comes from, I added
‘format’ calls at several points in ‘emit-bytecode’ and its callees,
while compiling the big CPS above again.

‘emit-bytecode’ starts by computing reachable functions.  The first of
these is the top-level thunk, which encompasses everything (all 428K
labels.)

On the first ‘allocate-slots’ call, memory consumption grows from ~500M
to 1300M; most of this comes from ‘compute-live-variables’, which is
also responsible for a large chunk of the run time.

In short, compiling the top-level thunk is what’s killing us, because
the space and time complexity is proportional to the number of labels
therein.

Also, our 16K-line python.scm file translates into 428K labels, which
during slot allocation translate into a dozen 428K-element intmaps and
intsets.  So the space cost of compilation per source line of code is
high.

Andy, what are your thoughts?

TIA,
Ludo’.
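
For scale, the figures above can be turned into rough per-line and
per-label costs.  The sketch below is only a back-of-the-envelope
calculation, not a measurement: it assumes "16K" means 16,000 lines,
that "M" means 10^6 bytes, and that the whole ~500M to 1300M growth is
attributable to the 428,185 labels.  The function and constant names
are made up for illustration.

```python
# Back-of-the-envelope costs derived from the numbers quoted in the message.
# Assumptions (not stated in the message): "16K" == 16,000 source lines,
# "M" == 10**6 bytes, and all memory growth is charged to the CPS labels.

LABELS = 428_185            # CPS labels in t.cps
SOURCE_LINES = 16_000       # approximate size of python.scm (assumed exact)
MEM_BEFORE = 500 * 10**6    # ~500M before the first allocate-slots call
MEM_AFTER = 1300 * 10**6    # ~1300M after it


def labels_per_line(labels: int, lines: int) -> float:
    """Average number of CPS labels generated per source line."""
    return labels / lines


def growth_per_unit(before: int, after: int, units: int) -> float:
    """Memory growth in bytes divided evenly over `units` items."""
    return (after - before) / units


if __name__ == "__main__":
    print("labels per source line:", labels_per_line(LABELS, SOURCE_LINES))
    print("bytes per label:", growth_per_unit(MEM_BEFORE, MEM_AFTER, LABELS))
    print("bytes per source line:",
          growth_per_unit(MEM_BEFORE, MEM_AFTER, SOURCE_LINES))
```

Under those assumptions this works out to roughly 27 labels per source
line and on the order of 50 KB of additional memory per source line
during slot allocation.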