On 22/04/2024 08:22, Benjamin Berg wrote:
On Mon, 2024-04-22 at 10:51 +0800, Tiwei Bie wrote:
On 4/18/24 5:23 PM, benja...@sipsolutions.net wrote:
diff --git a/arch/um/include/asm/mmu.h b/arch/um/include/asm/mmu.h
index 37eb6e89e79a..bf8da736609c 100644
--- a/arch/um/include/asm/mmu.h
+++ b/arch/um/include/asm/mmu.h
@@ -10,6 +10,10 @@
 typedef struct mm_context {
        struct mm_id id;
+
+       /* Address range in need of a TLB sync */
+       long int sync_tlb_range_from;
+       long int sync_tlb_range_to;

Why not "unsigned long"?

Oops, yes, it should be "unsigned long".


  } mm_context_t;
 extern void __switch_mm(struct mm_id * mm_idp);
diff --git a/arch/um/include/asm/pgtable.h
b/arch/um/include/asm/pgtable.h
index e1ece21dbe3f..5bb397b65efb 100644
--- a/arch/um/include/asm/pgtable.h
+++ b/arch/um/include/asm/pgtable.h
@@ -244,6 +244,38 @@ static inline void set_pte(pte_t *pteptr,
pte_t pteval)
 #define PFN_PTE_SHIFT PAGE_SHIFT +static inline void um_tlb_mark_sync(struct mm_struct *mm, unsigned
long start,
+                                   unsigned long end)
+{
+       if (!mm->context.sync_tlb_range_to) {
+               mm->context.sync_tlb_range_from = start;
+               mm->context.sync_tlb_range_to = end;
+       } else {
+               if (start < mm->context.sync_tlb_range_from)
+                       mm->context.sync_tlb_range_from = start;
+               if (end > mm->context.sync_tlb_range_to)
+                       mm->context.sync_tlb_range_to = end;
+       }
+}

IIUC, in some cases, the range [sync_tlb_range_from, sync_tlb_range_to)
might become very large when merging non-adjacent ranges? Could that
be an issue?

I figured it is not a big problem. It will result in scanning the
entire page table once to check whether the NEW_PAGE bit is set on any
PTE. I am assuming that this will happen almost never and scanning the
page table (but not doing syscalls) is reasonably cheap at the end.

diff --git a/arch/um/include/asm/tlbflush.h b/arch/um/include/asm/tlbflush.h
index d7cf82023b74..62816f6f1c91 100644
--- a/arch/um/include/asm/tlbflush.h
+++ b/arch/um/include/asm/tlbflush.h
@@ -9,24 +9,50 @@
  #include <linux/mm.h>
 /*
- * TLB flushing:
+ * In UML, we need to sync the TLB over by using mmap/munmap/mprotect syscalls
+ * from the process handling the MM (which can be the kernel itself).
+ *
+ * To track updates, we can hook into set_ptes and flush_tlb_*. With set_ptes
+ * we catch all PTE transitions where memory that was unusable becomes usable.
+ * While with flush_tlb_* we can track any memory that becomes unusable and
+ * even if a higher layer of the page table was modified.
+ *
+ * So, we simply track updates using both methods and mark the memory area to
+ * be synced later on. The only special case is that flush_tlb_kern_* needs to
+ * be executed immediately as there is no good synchronization point in that
+ * case. In contrast, in the set_ptes case we can wait for the next kernel
+ * segfault before we do the synchornization.
   *
- *  - flush_tlb() flushes the current mm struct TLBs
   *  - flush_tlb_all() flushes all processes TLBs
   *  - flush_tlb_mm(mm) flushes the specified mm context TLB's
   *  - flush_tlb_page(vma, vmaddr) flushes one page
- *  - flush_tlb_kernel_vm() flushes the kernel vm area
   *  - flush_tlb_range(vma, start, end) flushes a range of pages
+ *  - flush_tlb_kernel_range(start, end) flushes a range of kernel pages
   */
+extern int um_tlb_sync(struct mm_struct *mm);
+
  extern void flush_tlb_all(void);
  extern void flush_tlb_mm(struct mm_struct *mm);
-extern void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
-                           unsigned long end);
-extern void flush_tlb_page(struct vm_area_struct *vma, unsigned long address);
-extern void flush_tlb_kernel_vm(void);
-extern void flush_tlb_kernel_range(unsigned long start, unsigned long end);
-extern void __flush_tlb_one(unsigned long addr);
+
+static void flush_tlb_page(struct vm_area_struct *vma, unsigned long address)
+{
+       um_tlb_mark_sync(vma->vm_mm, address, address + PAGE_SIZE);
+}
+
+static void flush_tlb_range(struct vm_area_struct *vma, unsigned long start,
+                           unsigned long end)
+{
+       um_tlb_mark_sync(vma->vm_mm, start, end);
+}
+
+static void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+       um_tlb_mark_sync(&init_mm, start, end);
+
+       /* Kernel needs to be synced immediately */
+       um_tlb_sync(&init_mm);
+}

Nit: this is a header file, these functions should be defined as inline 
functions.

Yup, thanks!

diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index c137ff6f84dd..232aa7601d5d 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
[...]
-void flush_tlb_kernel_range(unsigned long start, unsigned long
end)
-{
-       flush_tlb_kernel_range_common(start, end);
-}
-
-void flush_tlb_kernel_vm(void)
-{
-       flush_tlb_kernel_range_common(start_vm, end_vm);
-}

The build breaks with this change, as there is still a call to
flush_tlb_kernel_vm() in ubd.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/um/drivers/ubd_kern.c?id=fb5d1d389c9e78d68f1f71f926d6251017579f5b#n774

Oh, thanks for the pointer!

I do not see a good reason for that call to even exist. My best theory
right now is that it existed to avoid later pagefaults for new memory
regions (the vmalloc?). So a workaround that is not needed anymore with
this patch.

It is there since prehistoric times.

No idea why and what it's doing. IMHO it is not needed.


Benjamin





--
Anton R. Ivanov
https://www.kot-begemot.co.uk/

Reply via email to