Hi all,

Our team is working on a project similar to llvm-qemu [1], which is also based on QEMU and LLVM. Currently, process mode works fine [2], and we are moving on to system mode.
Let me briefly introduce our framework and describe the problem we have run into.

We translate TCG IR into LLVM IR and let the LLVM JIT do the codegen. Our framework has both TCG and LLVM codegen capability. For short-running applications, TCG's code quality is good enough; LLVM codegen is meant for long-running applications. We have two code caches: one is the original QEMU code cache (for basic blocks) and the other is the LLVM code cache (for traces). Our notion of a trace is the same as the "super-blocks" mentioned in the discussion thread [3]: a trace is composed of a set of basic blocks. The goal is to enlarge the optimization scope, in the hope that the code quality of a trace is better than that of the individual basic blocks.

Here is an overview of our framework.

    QEMU code cache       LLVM code cache
        (block)               (trace)

          bb1 ------------>   trace1

If we find that a basic block (bb1) is hot enough (i.e., it has been executed many times), we start building a trace beginning with bb1 and let LLVM do the codegen. We place the optimized code in the LLVM code cache and patch the head of bb1 so that anyone executing bb1 jumps directly to trace1.

Since we are moving toward system mode, we have to consider situations where unlinking is needed. Block linking is done by QEMU itself, and we leave block unlinking to QEMU as well. The problem is when and where to break the link between a block and its trace. So far I can only spot two places where we should break the block -> trace link [4]. I don't know whether I have found them all or am missing something.

  1. cpu_unlink_tb (exec.c)
  2. tb_phys_invalidate (exec.c)

The big problem is debugging. We test our system using images downloaded from the QEMU website [5]. Basically, we want to see an operating system boot successfully, then log in and run some benchmarks on it. As a first step, we set a very high threshold for trace building. In other words, a basic block must be executed *many* times to trigger the trace building process.
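
To make the hot-block threshold and the block -> trace link concrete, here is a minimal sketch in C. It is only an illustration, not our actual code and not QEMU's API: Block, HOT_THRESHOLD, build_trace, enter_block and unlink_trace are hypothetical names, and "patching the head of bb1" is modeled as setting a pointer rather than rewriting host code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold: start high, then lower it step by step. */
#define HOT_THRESHOLD 1000u

/* Hypothetical stand-in for a translated basic block.
 * This is NOT QEMU's TranslationBlock. */
typedef struct Block {
    unsigned exec_count;      /* how many times the block has been entered */
    const void *trace_entry;  /* non-NULL once the block is linked to a trace */
} Block;

/* Placeholder for "build a trace starting at bb and let LLVM compile it".
 * It only returns a dummy, non-NULL address. */
static const void *build_trace(Block *bb)
{
    static const char fake_trace_code[1] = {0};
    (void)bb;
    return fake_trace_code;
}

/* Called whenever a block is entered. Returns the trace entry if the block
 * is already linked to a trace, or NULL if the block itself should run. */
static const void *enter_block(Block *bb)
{
    assert(bb != NULL);
    if (bb->trace_entry != NULL) {
        return bb->trace_entry;             /* block -> trace link is in place */
    }
    if (++bb->exec_count >= HOT_THRESHOLD) {
        bb->trace_entry = build_trace(bb);  /* models patching the block head */
    }
    return NULL;                            /* this run still uses the block */
}

/* Break the block -> trace link. In the scheme described above, this would
 * have to be reached from the unlink/invalidate paths (cpu_unlink_tb and
 * tb_phys_invalidate); how to hook it in there is left open here. */
static void unlink_trace(Block *bb)
{
    assert(bb != NULL);
    bb->trace_entry = NULL;
    bb->exec_count = 0;   /* assumption: the block must become hot again */
}
```

In this sketch, a caller invokes enter_block() on every block entry and jumps to the returned address when it is non-NULL; calling unlink_trace() restores the original behavior of the block. Resetting the counter on unlink is a design choice of the sketch, not something we have settled on.
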
Then we lower the threshold a bit at a time to see how things work. When something goes wrong, we may get a kernel panic, or the system hangs at some point during boot. I have no idea how to debug this kind of problem, so I would like to ask the mailing list for help, experience, or suggestions. I hope I have made the situation clear.

Thanks!

[1] http://code.google.com/p/llvm-qemu/
[2] I have to admit we have only tested our framework with the SPEC2006 benchmarks, not with _real_ applications.
[3] http://lists.cs.uiuc.edu/pipermail/llvmdev/2008-April/013689.html
[4] http://lists.gnu.org/archive/html/qemu-devel/2011-09/msg03643.html
[5] http://wiki.qemu.org/Download

Regards,
chenwj

--
Wei-Ren Chen (陳韋任)
Computer Systems Lab, Institute of Information Science,
Academia Sinica, Taiwan (R.O.C.)
Tel:886-2-2788-3799 #1667
Homepage: http://people.cs.nctu.edu.tw/~chenwj