Nicholas Clark wrote:
> I believe that your understanding of the JIT and the CG cores is still
> correct. The problem would be solved if we had some nice way of getting the
> C compiler to generate us nice stub versions of all the non-inline ops
> functions, which we could then place inline. However, I suspect that part of
> the speed of the CG core comes from the compiler (this is always gcc?)
> being able to do away with the function call and function return overheads
> between the ops it has inlined in the CG core.

You may want to check out the following two papers and their references:

I. Piumarta, F. Riccardi. Optimizing direct threaded code by selective
inlining. In ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI), June 17-19, Montreal, Canada, 1998.
ftp://ftp.inria.fr/INRIA/Projects/SOR/papers/1998/ODCSI_pldi98.ps.gz

M. Anton Ertl. A Portable Forth Engine. Proceedings euroFORTH '93,
pages 253-257.
http://www.complang.tuwien.ac.at/forth/threaded-code.html

Everything you ever wanted to know about optimising threaded interpreters,
but were too afraid to ask. The "selective inlining" method in particular
describes how to extract inline code blocks dynamically and then paste them
together.

Cheers,

Rhys.
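
P.S. In case a concrete picture helps, here is a minimal sketch of direct threaded code, the dispatch style both papers start from. It is not Parrot's CG core and it is not the selective inlining scheme itself, which goes further by copying the compiled code found between labels. The opcode set, the Cell union and run_threaded() are all made up for illustration. It also relies on the GNU C labels-as-values extension (&&label and goto *), so it assumes gcc or a compatible compiler.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine; not Parrot's real op set. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT, OP_COUNT };
enum { MAX_CODE = 64, MAX_STACK = 32 };

/* One cell of threaded code: a label address or an immediate operand. */
typedef union Cell {
    void *label;  /* address of the code implementing an op */
    long  imm;    /* operand that follows OP_PUSH */
} Cell;

/* Translates straight-line bytecode to direct threaded code and runs it.
 * Returns 0 and stores the top of stack in *result, or -1 if the
 * bytecode is malformed. Assumes GNU C (labels as values). */
int run_threaded(const long *code, size_t len, long *result)
{
    static void *const labels[OP_COUNT] = {
        &&do_push, &&do_add, &&do_mul, &&do_halt
    };
    Cell tc[MAX_CODE + 1];
    long stack[MAX_STACK];
    long *sp = stack;
    const Cell *ip;
    size_t i;
    int depth = 0;

    assert(code != NULL && result != NULL);
    if (len > MAX_CODE)
        return -1;

    /* Translation pass: replace each opcode by the address of its label,
     * and check stack depth up front so the ops need no runtime checks. */
    for (i = 0; i < len; i++) {
        long op = code[i];
        if (op < 0 || op >= OP_COUNT)
            return -1;
        tc[i].label = labels[op];
        if (op == OP_PUSH) {
            if (i + 1 >= len || depth >= MAX_STACK)
                return -1;
            depth++;
            i++;
            tc[i].imm = code[i];
        } else if (op == OP_ADD || op == OP_MUL) {
            if (depth < 2)
                return -1;
            depth--;
        }
    }
    tc[len].label = labels[OP_HALT];  /* sentinel: execution always stops */

    /* Each op ends by jumping straight to the next one; there is no
     * central dispatch loop and no call/return between ops. */
    ip = tc;
    goto *(ip++)->label;

do_push:
    *sp++ = (ip++)->imm;
    goto *(ip++)->label;
do_add:
    sp--;
    sp[-1] += sp[0];
    goto *(ip++)->label;
do_mul:
    sp--;
    sp[-1] *= sp[0];
    goto *(ip++)->label;
do_halt:
    *result = (sp > stack) ? sp[-1] : 0;
    return 0;
}
```

Feeding it { OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT } computes (2 + 3) * 4. As I read the Piumarta and Riccardi paper, selective inlining would then replace a run of such ops with one copied block, removing the jumps between them as well.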