On 10/4/24 06:52, Alex Bennée wrote:
We try to avoid using cpu_loop_exit_atomic as it brings in an all-core
sync point. However, on some cpu/kernel/benchmark combinations it is
starting to show up in the performance profile. To make it easier to
see what's going on, add tracepoints for the slow path so we can see
what is triggering the wait.

It seems that for a modern CPU the slow path can be hit quite a lot, for example:

./qemu-system-aarch64 \
            -machine type=virt,virtualization=on,pflash0=rom,pflash1=efivars,gic-version=max \
            -smp 4 \
            -accel tcg \
            -device virtio-net-pci,netdev=unet \
            -device virtio-scsi-pci \
            -device scsi-hd,drive=hd \
            -netdev user,id=unet,hostfwd=tcp::2222-:22 \
            -blockdev driver=raw,node-name=hd,file.driver=host_device,file.filename=/dev/zen-ssd2/trixie-arm64,discard=unmap \
            -serial mon:stdio \
            -blockdev node-name=rom,driver=file,filename=(pwd)/pc-bios/edk2-aarch64-code.fd,read-only=true \
            -blockdev node-name=efivars,driver=file,filename=$HOME/images/qemu-arm64-efivars \
            -m 8192 \
            -object memory-backend-memfd,id=mem,size=8G,share=on \
            -kernel /home/alex/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image -append "root=/dev/sda2 console=ttyAMA0 systemd.unit=benchmark-stress-ng.service" \
            -display none -d trace:load_atom\*_fallback,trace:store_atom\*_fallback

With:

   -cpu neoverse-v1,pauth-impdef=on => 2203343

With:

   -cpu cortex-a76 => 0

Signed-off-by: Alex Bennée <[email protected]>
Cc: Pierrick Bouvier <[email protected]>
---
  accel/tcg/user-exec.c          |  2 +-
  accel/tcg/ldst_atomicity.c.inc |  9 +++++++++
  accel/tcg/trace-events         | 12 ++++++++++++
  3 files changed, 22 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <[email protected]>

r~
