On 12/23/21 00:01, Richard Henderson wrote:
> In contrast to Daniel's version, the code stays in power8-pmu.c,
> but is better organized to not take so much overhead.
>
> Before:
>
>     32.97%  qemu-system-ppc  qemu-system-ppc64  [.] pmc_get_event
>     20.22%  qemu-system-ppc  qemu-system-ppc64  [.] helper_insns_inc
>      4.52%  qemu-system-ppc  qemu-system-ppc64  [.] hreg_compute_hflags_value
>      3.30%  qemu-system-ppc  qemu-system-ppc64  [.] helper_lookup_tb_ptr
>      2.68%  qemu-system-ppc  qemu-system-ppc64  [.] tcg_gen_code
>      2.28%  qemu-system-ppc  qemu-system-ppc64  [.] cpu_exec
>      1.84%  qemu-system-ppc  qemu-system-ppc64  [.] pmu_insn_cnt_enabled
>
> After:
>
>      8.42%  qemu-system-ppc  qemu-system-ppc64  [.] hreg_compute_hflags_value
>      6.65%  qemu-system-ppc  qemu-system-ppc64  [.] cpu_exec
>      6.63%  qemu-system-ppc  qemu-system-ppc64  [.] helper_insns_inc

Thanks for looking this up. I had no idea the original C code was that slow.

Unfortunately, this reorg breaks the PMU-EBB tests. These tests are run from the kernel tree [1], and I test them inside a pSeries TCG guest. You'll need to apply patches 9 and 10 of [2] beforehand (they apply cleanly on current master) because they aren't upstream yet and EBB needs them.

The tests that break consistently with this reorg are:

  back_to_back_ebbs_test.c
  cpu_event_pinned_vs_ebb_test.c
  cycles_test.c
  task_event_pinned_vs_ebb_test.c

The issue here is that these tests exercise different Perf events and aspects of branching (e.g. how fast we're detecting a counter overflow, how many times, etc.), and I haven't been able to find a fix using your C reorg yet.

With that in mind I decided to post a new version of my TCG rework, more concise and with less repetition, as an alternative that can be used upstream to fix the Avocado tests. Meanwhile I'll see if I can get your reorg working with all the EBB tests we need.

All things being equal (similar performance, all EBB tests passing), I'd rather go with your C code than my TCG rework, since yours doesn't require knowledge of TCG ops to maintain.

Thanks,

Daniel

[1] https://github.com/torvalds/linux/tree/master/tools/testing/selftests/powerpc/pmu/ebb
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-12/msg00073.html

> r~
>
> Richard Henderson (3):
>   target/ppc: Cache per-pmc insn and cycle count settings
>   target/ppc: Rewrite pmu_increment_insns
>   target/ppc: Use env->pnc_cyc_cnt
>
>  target/ppc/cpu.h         |   3 +
>  target/ppc/power8-pmu.h  |  14 +--
>  target/ppc/cpu_init.c    |   1 +
>  target/ppc/helper_regs.c |   2 +-
>  target/ppc/machine.c     |   2 +
>  target/ppc/power8-pmu.c  | 230 ++++++++++++++++-----------------------
>  6 files changed, 108 insertions(+), 144 deletions(-)