On Fri, Nov 16, 2018 at 14:55:01 +0100, Etienne Dublé wrote:
(snip)
> So the idea is: what if we could share the cache of already-translated
> code between all those processes?
>
> There would be several ways to achieve this:
> * use a shared memory area for the cache, plus locking mechanisms.
> * have a (maybe optional) daemon that would manage the cache of all
>   processes.
> * Python-like model: the first time a binary or library is translated,
>   save the translated code in a cache file next to the original file,
>   with a different extension.
>
> Please let me know what you think about it, whether something similar
> has already been studied, or whether I'm missing something obvious.
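For reference, the third option (a side-car cache file next to the binary)
could be sketched roughly as below. All names here are illustrative only --
this is not a QEMU API, and a real implementation would also have to key the
cache on things like the QEMU version and translation flags:

```python
import os
import pickle

CACHE_SUFFIX = ".tcgcache"  # hypothetical "different extension"

def load_or_translate(binary_path, translate):
    """Return translated code for binary_path, using a side-car cache file.

    `translate` is a placeholder for the (expensive) translation step.
    """
    cache_path = binary_path + CACHE_SUFFIX
    try:
        # Reuse the cache only if it is at least as new as the binary,
        # so a recompiled binary invalidates the stale translation.
        if os.path.getmtime(cache_path) >= os.path.getmtime(binary_path):
            with open(cache_path, "rb") as f:
                return pickle.load(f)
    except OSError:
        pass  # no cache yet, or it is unreadable: fall through and translate
    code = translate(binary_path)
    with open(cache_path, "wb") as f:
        pickle.dump(code, f)
    return code
```

On a second run (or a second process) the translation step is skipped
entirely, which is where the proposed saving would come from.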
There's a recent paper that implements something similar to what you
propose: "A General Persistent Code Caching Framework for Dynamic Binary
Translation (DBT)", ATC'16:
https://www.usenix.org/system/files/conference/atc16/atc16_paper-wang.pdf

Note that in that paper they compare against HQEMU, not against upstream
QEMU. I presume they chose HQEMU because it spends more effort than QEMU
trying to generate better code for hot code paths (it uses LLVM in a
separate thread for those), which means that code generation can be a
bottleneck for some workloads (e.g. SPEC's gcc or perlbench).

QEMU, on the other hand, generates much simpler code, and as a result it
is rare to find workloads where code generation is a bottleneck. (You can
measure this with perf top on your system; make sure you configured QEMU
with --disable-strip to keep the symbols after "make install".)

So until QEMU gets some sort of "hot code optimization" that makes
translation more expensive, there is little point in implementing
persistent code caching for it.

As an aside, what QEMU version are you running? Performance has improved
quite a bit (particularly for integer workloads) over the last couple of
years; e.g. see the perf improvements from v2.6 to v2.11 here:
https://imgur.com/a/5P5zj

Cheers,

Emilio