2010/7/22 Jan Kiszka <jan.kis...@siemens.com>:
> Stefan Hajnoczi wrote:
>> On Thu, Jul 22, 2010 at 9:48 AM, Chen Yufei <cyfde...@gmail.com> wrote:
>>> On 2010-7-22, at 1:04 AM, Stefan Weil wrote:
>>>
>>>> On 21.07.2010 09:03, Chen Yufei wrote:
>>>>> On 2010-7-21, at 5:43 AM, Blue Swirl wrote:
>>>>>
>>>>>> On Sat, Jul 17, 2010 at 10:27 AM, Chen Yufei <cyfde...@gmail.com> wrote:
>>>>>>
>>>>>>> We are pleased to announce COREMU, which is a "multicore-on-multicore"
>>>>>>> full-system emulator built on Qemu. (Simply speaking, we made Qemu
>>>>>>> parallel.)
>>>>>>>
>>>>>>> The project web page is located at:
>>>>>>> http://ppi.fudan.edu.cn/coremu
>>>>>>>
>>>>>>> You can also download the source code, images for playing on sourceforge
>>>>>>> http://sf.net/p/coremu
>>>>>>>
>>>>>>> COREMU is composed of
>>>>>>> 1. a parallel emulation library
>>>>>>> 2. a set of patches to qemu
>>>>>>> (We worked on the master branch, commit
>>>>>>> 54d7cf136f040713095cbc064f62d753bff6f9d2)
>>>>>>>
>>>>>>> It currently supports full-system emulation of x64 and ARM MPcore
>>>>>>> platforms.
>>>>>>>
>>>>>>> By leveraging the underlying multicore resources, it can emulate up to
>>>>>>> 255 cores running commodity operating systems (even on a 4-core
>>>>>>> machine).
>>>>>>>
>>>>>>> Enjoy,
>>>>>>>
>>>>>> Nice work. Do you plan to submit the improvements back to upstream QEMU?
>>>>>>
>>>>> It would be great if we can submit our code to QEMU, but we do not know
>>>>> the process.
>>>>> Would you please give us some instructions?
>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Chen Yufei
>>>>>
>>>> Some hints can be found here:
>>>> http://wiki.qemu.org/Contribute/StartHere
>>>>
>>>> Kind regards,
>>>> Stefan Weil
>>> The patch is in the attachment, produced with the command
>>> git diff 54d7cf136f040713095cbc064f62d753bff6f9d2
>>>
>>> In order to separate what needs to be done to make QEMU parallel, we created
>>> a separate library, and the patched QEMU needs to be compiled and linked
>>> with that library. To submit our enhancement to QEMU, maybe we need to
>>> incorporate this library into QEMU. I don't know what would be the best
>>> solution.
>>>
>>> Our approach to making QEMU parallel can be found at
>>> http://ppi.fudan.edu.cn/coremu
>>>
>>> I will give a short summary here:
>>>
>>> 1. Each emulated core thread runs a separate binary translator engine and
>>> has a private code cache. We marked some variables in TCG as thread local. We
>>> also modified the TB invalidation mechanism.
>>>
>>> 2. Each core has a queue holding pending interrupts. The COREMU library
>>> provides this queue, and interrupt notification is done by sending realtime
>>> signals to the emulated core thread.
>>>
>>> 3. Atomic instruction emulation has to be modified for parallel emulation.
>>> We use a lightweight memory transaction, which requires only a
>>> compare-and-swap instruction, to emulate atomic instructions.
>>>
>>> 4. Some code in the original QEMU may cause data race bugs after we make it
>>> parallel. We fixed these problems.
>>>
>>> --
>>> Best regards,
>>> Chen Yufei
>>
>> Looking at the patch it seems there is a global lock for hardware
>> access via coremu_spin_lock(&cm_hw_lock). How many cores have you
>> tried running and do you have lock contention data for cm_hw_lock?
>
> BTW, this kind of lock is called qemu_global_mutex in QEMU, thus it is a
> sleepy lock here which is likely better for the code paths protected by
> it in upstream. Are they shorter in COREMU?
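
On point 3 of the summary quoted above: the thread does not show how COREMU
implements its compare-and-swap based emulation, so the following is only a
minimal sketch of the general idea, written with C11 <stdatomic.h>. The
function names emulated_atomic_fetch_add and emulated_cmpxchg are hypothetical
and do not come from the COREMU patch or from QEMU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from COREMU or QEMU): emulate a guest atomic
 * fetch-and-add on a 32-bit word using only host compare-and-swap. */
uint32_t emulated_atomic_fetch_add(_Atomic uint32_t *guest_word, uint32_t delta)
{
    assert(guest_word != NULL);
    uint32_t old = atomic_load(guest_word);
    /* Retry until no other emulated core modified the word between our
     * read and our write. A failed CAS reloads 'old' with the current
     * value, so the next attempt works on fresh data. */
    while (!atomic_compare_exchange_weak(guest_word, &old, old + delta)) {
        /* nothing to do: 'old' was refreshed by the failed CAS */
    }
    return old;
}

/* Hypothetical helper: emulate a guest compare-and-exchange. Stores
 * 'desired' only if the word equals 'expected'; returns the value that
 * was observed in memory either way. */
uint32_t emulated_cmpxchg(_Atomic uint32_t *guest_word,
                          uint32_t expected, uint32_t desired)
{
    assert(guest_word != NULL);
    uint32_t observed = expected;
    atomic_compare_exchange_strong(guest_word, &observed, desired);
    return observed;
}
```

The retry loop is the part that makes this safe when several emulated core
threads touch the same guest word; no lock is taken on this path.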
>
>> Have you thought about making hardware emulation concurrent?
>>
>> These are issues that qemu-kvm faces today since it executes vcpu
>> threads in parallel. Both qemu-kvm and the COREMU patches could
>> benefit from a solution for concurrent hardware access.
>
> While we are all looking forward to seeing more scalable hardware models
> :), I think it is a topic that can be addressed largely independently of
> parallelizing TCG VCPUs. The latter can benefit from the former, for
> sure, but it first of all has to solve its own issues.

Right, but it's worth discussing with people who have worked on
parallel vcpus from a different angle.

> Note that --enable-io-thread provides truly parallel KVM VCPUs also in
> upstream these days. Just for TCG, we need that slightly suboptimal CPU
> scheduling inside single-threaded tcg_cpu_exec (was renamed to
> cpu_exec_all today).
>
> Jan
>
> --
> Siemens AG, Corporate Technology, CT T DE IT 1
> Corporate Competence Center Embedded Linux
>
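
For readers unfamiliar with the single-threaded TCG scheduling mentioned
above, here is a rough sketch of what running several emulated CPUs from one
host thread amounts to, assuming a plain round-robin pass. It is not QEMU's
tcg_cpu_exec/cpu_exec_all code; SketchVCPU, sketch_run_slice and
sketch_exec_all are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an emulated CPU; this is not QEMU's CPUState. */
typedef struct {
    int halted;              /* non-zero: this CPU has nothing to run */
    unsigned long executed;  /* number of time slices this CPU received */
} SketchVCPU;

/* Hypothetical: run one time slice of guest code on a single VCPU. */
static void sketch_run_slice(SketchVCPU *cpu)
{
    cpu->executed++;
}

/* One round-robin pass on a single host thread: every runnable VCPU gets
 * exactly one slice, one after another. Returns how many VCPUs ran. */
size_t sketch_exec_all(SketchVCPU *cpus, size_t ncpus)
{
    assert(cpus != NULL || ncpus == 0);
    size_t ran = 0;
    for (size_t i = 0; i < ncpus; i++) {
        if (cpus[i].halted) {
            continue;  /* skip CPUs that are idle */
        }
        sketch_run_slice(&cpus[i]);
        ran++;
    }
    return ran;
}
```

Because the slices run strictly one after another, adding guest CPUs cannot
use more than one host core, which is the limitation a per-core thread model
like COREMU's is meant to remove.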