On Sat, Oct 27, 2018 at 10:14:47 +0100, Alex Bennée wrote:
>
> Emilio G. Cota <c...@braap.org> writes:
>
> > [I forgot to add the cover letter to git send-email; here it is]
> >
> > v3: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg04179.html
> >
> > "Why is this an RFC?" See v3 link above. Also, see comment at
> > the bottom of this message regarding the last patch of this series.
>
> I'm also seeing this hang on check-tcg, specifically qemu-aarch64
> ./tests/linux-test

Thanks for reporting. The last patch in the series is the one that
causes the hang. I didn't test that patch much, since I did not intend
to get it merged.

Over the weekend I had a bit of time to think about an actual fix,
i.e. how to reduce safe work calls for TLB invalidations. The idea is
to check whether the remote invalidation is necessary at all; we can
take the remote TLB's lock, and check whether the address we want to
invalidate has been read by the remote CPU since its latest flush.
On some quick tests booting an aarch64 system I measured that only up
to ~2% of remote invalidations are actually necessary.

I just did a search on Google Scholar and found a similar approach
to reduce remote TLB shootdowns on ARM, this time for hardware.
This paper
  "TLB Shootdown Mitigation for Low-Power Many-Core Servers with
  L1 Virtual Caches"
  https://dl.acm.org/citation.cfm?id=3202975
addresses the issue by employing Bloom filters in hardware to
determine whether an address has been accessed by a TLB before
performing an invalidation (and the corresponding icache flush).

In software, using a per-TLB hash table might be enough. I'll try
to have something ready for v5.

Thanks,

		Emilio
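
For illustration only, here is a rough standalone sketch of the check
described above: record which pages a TLB has filled since its last
flush, and consult that record under the TLB's lock before scheduling a
remote invalidation. This is not code from the series. All names
(SketchTLB, sketch_tlb_*) are hypothetical, and instead of a full hash
table it uses a hashed bitmap (effectively a one-hash Bloom filter), so
it can report false positives but never false negatives. The 4 KiB
page size and 1024-bit filter are assumptions as well.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_BITS   12     /* assumed 4 KiB pages */
#define SKETCH_FILTER_BITS 10     /* 2^10 = 1024 filter bits (assumption) */
#define SKETCH_WORDS       ((1u << SKETCH_FILTER_BITS) / 64)

/* Hypothetical per-TLB state: a lock plus a "pages seen" bitmap. */
typedef struct {
    pthread_mutex_t lock;
    uint64_t seen[SKETCH_WORDS];
} SketchTLB;

/* Map a virtual address to a bit index via a multiplicative hash. */
static unsigned sketch_hash(uint64_t vaddr)
{
    uint64_t page = vaddr >> SKETCH_PAGE_BITS;
    return (unsigned)((page * 0x9E3779B97F4A7C15ULL)
                      >> (64 - SKETCH_FILTER_BITS));
}

static void sketch_tlb_init(SketchTLB *t)
{
    assert(t != NULL);
    pthread_mutex_init(&t->lock, NULL);
    memset(t->seen, 0, sizeof(t->seen));
}

/* Owner vCPU: call whenever a translation for vaddr is filled. */
static void sketch_tlb_record_fill(SketchTLB *t, uint64_t vaddr)
{
    unsigned h = sketch_hash(vaddr);
    pthread_mutex_lock(&t->lock);
    t->seen[h / 64] |= 1ULL << (h % 64);
    pthread_mutex_unlock(&t->lock);
}

/* Owner vCPU: a full flush forgets everything seen so far. */
static void sketch_tlb_flush(SketchTLB *t)
{
    pthread_mutex_lock(&t->lock);
    memset(t->seen, 0, sizeof(t->seen));
    pthread_mutex_unlock(&t->lock);
}

/*
 * Remote vCPU: returns false only if the page was definitely not
 * filled since the last flush, so the invalidation can be skipped.
 * A true result may be a false positive caused by a hash collision.
 */
static bool sketch_tlb_remote_inval_needed(SketchTLB *t, uint64_t vaddr)
{
    unsigned h = sketch_hash(vaddr);
    pthread_mutex_lock(&t->lock);
    bool needed = (t->seen[h / 64] >> (h % 64)) & 1;
    pthread_mutex_unlock(&t->lock);
    return needed;
}
```

A caller would only queue the (expensive) safe work when
sketch_tlb_remote_inval_needed() returns true. A collision merely
causes an unnecessary invalidation, which is safe; a smaller filter
trades memory for more of those.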