On Fri, 11 Jan 2019 at 22:24, Matwey V. Kornilov <matwey.korni...@gmail.com> wrote:
>
> On Fri, 11 Jan 2019 at 12:52, Peter Maydell <peter.mayd...@linaro.org> wrote:
> >
> > On Thu, 10 Jan 2019 at 19:33, Matwey V. Kornilov
> > <matwey.korni...@gmail.com> wrote:
> > > I am running the same application compiled for aarch64 and armv7l on
> > > an x86_64 platform using qemu-user-linux tools.
> > >
> > > I see a dramatic performance difference (30 times) between emulated
> > > architectures: aarch64 runs for ~4 minutes, armv7l runs for ~2 hours.
> > > I do understand that CPU architecture emulation is an inherently slow
> > > thing, but my question is about the difference.
> > >
> > > How could I debug to understand what is the reason for such a big
> > > difference? I've already tried to run stress-ng compiled for these two
> > > architectures, but it leads to the same performance per second.
> > >
> > > I am running qemu 2.11, should I try other version?
> >
> > Yes, do try 3.1 -- we have done some overall TCG performance
> > improvements.
>
> Indeed, qemu-arm from master runs for 4 minutes where 2.11 runs for 2
> hours for me. It is an impressive improvement.

I've managed to bisect the first good (fast) commit:

commit 2a53535af471f4bee9d6cb5b363746b8d5ed21dd
Author: Luke Shumaker <luke...@parabola.nu>
Date:   Thu Dec 28 13:08:13 2017 -0500

    linux-user: init_guest_space: Try to make ARM space+commpage continuous

Though I am not sure how it helps.

> >
> > For a big difference between target architectures like that,
> > I would try starting by using some host performance tools on
> > the two runs to see where all the time is being taken in
> > the armv7l guest run -- is it all in translated guest code,
> > or is there more time (proportionally) spent in particular
> > parts of the QEMU C code? Does the armv7l version do
> > many more or different syscalls (check with the QEMU -strace
> > option) ?
> >
> > Also you should check performance on h/w 32 bit vs
> > 64-bit Arm if you can, to confirm that it's not just
> > that the guest application runs much slower there.
> > (If you don't have the arm hardware you could at least
> > check x86 32-bit vs 64-bit.)
> >
> > thanks
> > -- PMM
>
> --
> With best regards,
> Matwey V. Kornilov

--
With best regards,
Matwey V. Kornilov
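
For the -strace comparison suggested in the quoted text above, a minimal sketch in Python that counts syscalls per name in two captured logs and lists the names whose counts differ most. It assumes each log line has the shape "<pid> <syscall>(<args>) = <ret>"; that shape, and the helper names count_syscalls and biggest_differences, are assumptions for illustration, not something stated in this thread.

```python
import re
from collections import Counter

# Assumed log line shape: "<pid> <syscall>(<args>) = <ret>".
# Adjust the pattern if the captured -strace output looks different.
SYSCALL_RE = re.compile(r"^\s*\d+\s+([A-Za-z_][A-Za-z0-9_]*)\(")


def count_syscalls(lines):
    """Count syscall names in an iterable of log lines; non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        match = SYSCALL_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def biggest_differences(counts_a, counts_b, top=10):
    """Return (name, count_a, count_b) rows, largest absolute difference first."""
    names = set(counts_a) | set(counts_b)
    # Counter yields 0 for names that appear in only one of the two logs.
    rows = [(name, counts_a[name], counts_b[name]) for name in names]
    rows.sort(key=lambda row: (-abs(row[1] - row[2]), row[0]))
    return rows[:top]


# Hypothetical usage, with one log file per architecture:
#   with open("armv7l.log") as fa, open("aarch64.log") as fb:
#       for row in biggest_differences(count_syscalls(fa), count_syscalls(fb)):
#           print(row)
```

A syscall that shows up far more often in the armv7l log than in the aarch64 one would be a reasonable place to start looking, though equal counts would not rule out time being spent elsewhere in QEMU.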