Cc: Aleksandar Markovic <amarko...@wavecomp.com>
Cc: Alexander Graf <ag...@suse.de>
Cc: Alistair Francis <alistai...@gmail.com>
Cc: Andrzej Zaborowski <balr...@gmail.com>
Cc: Anthony Green <gr...@moxielogic.com>
Cc: Artyom Tarasenko <atar4q...@gmail.com>
Cc: Aurelien Jarno <aurel...@aurel32.net>
Cc: Bastian Koppelmann <kbast...@mail.uni-paderborn.de>
Cc: Christian Borntraeger <borntrae...@de.ibm.com>
Cc: Chris Wulff <crwu...@gmail.com>
Cc: Cornelia Huck <coh...@redhat.com>
Cc: David Gibson <da...@gibson.dropbear.id.au>
Cc: David Hildenbrand <da...@redhat.com>
Cc: "Edgar E. Iglesias" <edgar.igles...@gmail.com>
Cc: Eduardo Habkost <ehabk...@redhat.com>
Cc: Fabien Chouteau <chout...@adacore.com>
Cc: Guan Xuetao <g...@mprc.pku.edu.cn>
Cc: James Hogan <jho...@kernel.org>
Cc: Laurent Vivier <laur...@vivier.eu>
Cc: Marek Vasut <ma...@denx.de>
Cc: Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk>
Cc: Max Filippov <jcmvb...@gmail.com>
Cc: Michael Clark <m...@sifive.com>
Cc: Michael Walle <mich...@walle.cc>
Cc: Palmer Dabbelt <pal...@sifive.com>
Cc: Pavel Dovgalyuk <dovga...@ispras.ru>
Cc: Peter Crosthwaite <crosthwaite.pe...@gmail.com>
Cc: Peter Maydell <peter.mayd...@linaro.org>
Cc: qemu-...@nongnu.org
Cc: qemu-...@nongnu.org
Cc: qemu-s3...@nongnu.org
Cc: Richard Henderson <r...@twiddle.net>
Cc: Sagar Karandikar <sag...@eecs.berkeley.edu>
Cc: Stafford Horne <sho...@gmail.com>
I'm calling this series a v3 because it supersedes the two series I
previously sent about using atomics for interrupt_request:

  https://lists.gnu.org/archive/html/qemu-devel/2018-09/msg02013.html

The approach in that series cannot work reliably: using (locked) atomics
to set interrupt_request but not using (locked) atomics to read it can
lead to missed updates.

This series takes a different approach: it serializes access to many
CPUState fields, including .interrupt_request, with a per-CPU lock.
Protecting more fields of CPUState with the lock then allows us to
substitute the per-CPU lock for the BQL in many places, notably the
execution loop in cpus.c. This leads to better scalability for MTTCG,
since CPUs no longer have to acquire a contended lock (the BQL) every
time they stop executing code.

Some hurdles remain:

1. I am not happy with the shutdown path via pause_all_vcpus. What
   happens if
   (a) a CPU is added while we're calling pause_all_vcpus?
   (b) some CPUs are trying to run exclusive work while we call
       pause_all_vcpus?
   Am I being overly paranoid here?

2. I have done very light testing with x86_64 KVM, and no testing with
   other accels (hvf, hax, whpx). check-qtest works, except for an
   s390x test that to me is broken in master -- I reported the problem
   here:
   https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03728.html

3. This might break record-replay. A quick test with icount on aarch64
   seems to work, but I haven't tested icount extensively.

4. Some architectures still need the BQL in cpu_has_work. This leads to
   some contortions to avoid deadlock, since in this series cpu_has_work
   is called with the CPU lock held.

5. The interrupt handling path still runs with the BQL held, mostly
   because the ISAs I routinely work with need the BQL anyway when
   handling the interrupt. We can complete the pushdown of the BQL to
   .do_interrupt/.exec_interrupt later on; this series is already way
   too long.
Points (1)-(3) make this series an RFC rather than a proper patch
series. I'd appreciate feedback on the approach and/or testing.

Note that this series fixes a bug whereby cpu_has_work was called
without the BQL (from cpu_handle_halt). After this series, cpu_has_work
is called with the CPU lock held, and only the targets that need the BQL
in cpu_has_work acquire it.

For some performance numbers, see the last patch.

The series is checkpatch-clean; the only warning is due to the use of
__COVERITY__ in cpus.c.

You can fetch this series from:

  https://github.com/cota/qemu/tree/cpu-lock-v3

Note that it applies on top of tcg-next + my dynamic TLB series, which
I'm using in the faint hope that the ubuntu experiments might run a bit
faster.

Thanks!

		Emilio