On Wed 25 Oct 2017 19:42, l...@gnu.org (Ludovic Courtès) writes:

> In short, compiling the top-level thunk is what’s killing us, because
> the space and time complexity is proportional to the number of labels
> therein.
>
> Also, our 16K-line python.scm file translates into 428K labels, which
> during slot allocation translates into a dozen 428K-element intmaps
> and intsets.  So the compilation cost in space per source line of
> code is high.
>
> Andy, what are your thoughts?
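For concreteness, here is a toy illustration of that space cost, using
Guile's (language cps intset) module.  The loop is only illustrative;
it is not the real slot-allocation code, just a sketch of the "one
intset per label" pattern:

    (use-modules (language cps intset))

    ;; Analyses like liveness keep one set per label, so with 428K
    ;; labels a dozen such tables means a dozen 428K-entry structures.
    (define (per-label-sets n-labels)
      (let lp ((label 0) (sets '()))
        (if (< label n-labels)
            (lp (1+ label)
                (cons (intset-add empty-intset label) sets))
            sets)))

    (length (per-label-sets 428000))  ; space scales with label count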
I think that probably these 428K labels are partitioned into a number
of functions; it is true that the biggest one takes the most time.
That leaves a few options:

  - Should we attempt to speed this up as it is?
  - Should we try harder to simplify this graph even at low
    optimization levels (e.g. a "simplify" pass)?
  - Should we avoid CSE and periodically split live ranges?
  - Should we use a different register allocation strategy on large
    functions?  (See the sketch below.)
  - Or would a cross-cutting solution like a JIT actually be the
    answer?

Or does the stack marking show the same pessimal behavior as the weak
table, in that it's a large object behind a mark function?

I do not know.  I have tried many of the tests that you have done, but
I don't have a conclusive answer.
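To show what I mean by a size-triggered allocation strategy, here is a
sketch in pseudo-Scheme.  Note that `optimistic-allocate',
`one-pass-allocate', and the threshold are hypothetical names and
numbers, not anything that exists in Guile today:

    ;; Hypothetical dispatch: above some label-count cutoff, skip the
    ;; precise allocator and use a cheap one-pass scheme that trades
    ;; slot quality for roughly linear compile time.
    (define *label-threshold* 10000)          ; made-up cutoff

    (define (allocate-slots fn n-labels)
      (if (< n-labels *label-threshold*)
          (optimistic-allocate fn)    ; today's precise algorithm
          (one-pass-allocate fn)))    ; cheap fallback for huge functions

Andy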