On 8/9/24 17:47, Nicholas Piggin wrote:
> This is not a clean patch, but it does fix a problem I hit with TB
> invalidation caused by the target software writing to memory that
> contains TBs. Lockup messages are triggering in Linux because page
> clearing takes a long time when a code page has been freed: the clear
> hits a lot of notdirty notifiers, which massively slows things down.
> Linux may also have a bug here, because it seems to hang indefinitely
> in some cases, but even if it didn't, the latency of clearing these
> pages is very high.
>
> This showed up when running KVM on the emulated machine, starting and
> stopping guests, which frees lots of instruction pages. Usually, if
> you're just running Linux, executable pages remain in the pagecache,
> so you get fewer of these bombs in the kernel memory allocator. But
> page reclaim, JITs, deleting executable files, etc., could trigger it
> too.
>
> Invalidating all TBs from the page on any hit seems to avoid the
> problem and generally speeds things up.
>
> How important is the precise invalidation? These days I assume the
> tricky kind of SMC that frequently writes code close to where it's
> executing is pretty rare, and might not be something we really care
> about for performance. Could we remove sub-page TB invalidation
> entirely?
Happens on x86 and s390 regularly enough, so we can't remove it.
> @@ -1107,6 +1107,9 @@ tb_invalidate_phys_page_range__locked(struct page_collection *pages,
>      TranslationBlock *current_tb = retaddr ? tcg_tb_lookup(retaddr) : NULL;
>  #endif /* TARGET_HAS_PRECISE_SMC */
> +    start &= TARGET_PAGE_MASK;
> +    last |= ~TARGET_PAGE_MASK;
> +
>      /* Range may not cross a page. */
>      tcg_debug_assert(((start ^ last) & TARGET_PAGE_MASK) == 0);
This would definitely break SMC.
However, there's a better solution. We already iterate over only the TBs on the
current page. Move *everything* except the tb_phys_invalidate__locked call inside the
SMC ifdef, and unconditionally invalidate every TB selected by the loop.
We experimented with something like this for aarch64, which used to spend a lot of the
kernel startup time invalidating code pages from the (somewhat bloated) EDK2 bios. But it
turned out the bigger problem was address space randomization, and with CF_PCREL the
problem appeared to go away.
I don't think we've done any kvm-under-tcg performance testing, but lockup messages would
certainly be something to look for...
r~