Emilio G. Cota <c...@braap.org> writes:
> On Sat, Oct 27, 2018 at 10:14:47 +0100, Alex Bennée wrote:
>> Emilio G. Cota <c...@braap.org> writes:
>>
>> > [I forgot to add the cover letter to git send-email; here it is]
>> >
>> > v3: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg04179.html
>> >
>> > "Why is this an RFC?" See v3 link above. Also, see the comment at
>> > the bottom of this message regarding the last patch of this series.
>>
>> I'm also seeing this hang on check-tcg, specifically qemu-aarch64
>> ./tests/linux-test
>
> Thanks for reporting. The last patch in the series is the one
> that causes the hang. I didn't test that patch much, since I
> did not intend to get it merged.

See patch 3; I think it's just because the per-CPU locks aren't
available in linux-user, breaking the exclusive mechanism.

> Over the weekend I had a bit of time to think about an actual fix, i.e.
> how to reduce safe work calls for TLB invalidations. The idea is to check
> whether the remote invalidation is necessary at all; we can take the remote
> TLB's lock and check whether the address we want to invalidate has been
> read by the remote CPU since its latest flush. In some quick tests
> booting an aarch64 system I measured that only up to ~2% of remote
> invalidations are actually necessary.
>
> I just did a search on Google Scholar and found a similar approach
> to reducing remote TLB shootdowns on ARM, this time in hardware.
> This paper:
>
>   "TLB Shootdown Mitigation for Low-Power Many-Core Servers with
>    L1 Virtual Caches"
>   https://dl.acm.org/citation.cfm?id=3202975
>
> addresses the issue by employing Bloom filters in hardware to
> determine whether an address has been accessed by a TLB before
> performing an invalidation (and the corresponding icache flush).
>
> In software, using a per-TLB hash table might be enough. I'll try
> to have something ready for v5.

OK.

--
Alex Bennée