On 10/11/24 07:47, Richard Henderson wrote:
On 10/9/24 17:35, Pierrick Bouvier wrote:
@@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full, CPUTLBEntry *ent,
                                      vaddr address, int flags,
                                      MMUAccessType access_type, bool enable)
   {
-    if (enable) {
-        address |= flags & TLB_FLAGS_MASK;
-        flags &= TLB_SLOW_FLAGS_MASK;
-        if (flags) {
-            address |= TLB_FORCE_SLOW;
-        }
-    } else {
-        address = -1;
-        flags = 0;
+    if (!enable) {
+        address = TLB_INVALID_MASK;
+    }
+    address |= flags & TLB_FLAGS_MASK;
+    flags &= TLB_SLOW_FLAGS_MASK;
+    if (flags) {
+        address |= TLB_FORCE_SLOW;
     }

I'm not sure I follow this change correctly. After, the final address and flags values depend on the flags parameter alone, while before they depended on both the flags and enable parameters.

This deserves to be split out; I even thought about it Monday night but then forgot when I restarted on Tuesday morning.


Agree.

Before, address is -1 for disabled, mostly because that mirrors what you get with memset to initialize the tlb.  All of the flags are discarded.  But the only thing that's important is that TLB_INVALID_MASK is set.

After, TLB_INVALID_MASK is still set, but the flags are retained.  This is because we want a source of those flags to use for MMU_INST_FETCH.  With this patch set we no longer store flags for execute and instead grab them from the flags for read.  From tlb_set_page_full...


That's subtle. Setting TLB_INVALID_MASK seems to imply we won't check any flags on this entry. Maybe adding a comment here would clarify this.
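
For instance, something along these lines (only a sketch of the kind of comment I have in mind, written on top of the new code; not part of the patch):

    if (!enable) {
        /*
         * Mark the comparator as never matching: TLB_INVALID_MASK makes
         * every address comparison against this entry fail.  The flags
         * are still merged in below, because the read flags now serve
         * as the source of flags for MMU_INST_FETCH even when the
         * comparator itself is disabled.
         */
        address = TLB_INVALID_MASK;
    }
    address |= flags & TLB_FLAGS_MASK;
    flags &= TLB_SLOW_FLAGS_MASK;
    if (flags) {
        address |= TLB_FORCE_SLOW;
    }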


@@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
       /* Now calculate the new entry */
       node->copy.addend = addend - addr_page;
-    tlb_set_compare(full, &node->copy, addr_page, read_flags,
-                    MMU_INST_FETCH, prot & PAGE_EXEC);
-
       if (wp_flags & BP_MEM_READ) {
           read_flags |= TLB_WATCHPOINT;
       }

... we can see that the only difference between the two is the watchpoint bit.
Importantly, TLB_MMIO is common to all three comparators.
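
If I read this correctly, the execute flags can then be recovered from the read comparator with something like the following sketch (variable names assumed, not taken from the series):

    /*
     * Sketch: derive the flags for MMU_INST_FETCH from the flags stored
     * for read, dropping the only bit that differs between the two.
     */
    int fetch_flags = read_flags & ~TLB_WATCHPOINT;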

Sounds good to have a fast path for code fetch. Did you measure the benefit, or just implement this thinking it would be worth it?

This is not about a fast path for code fetch, but about always using the *slow* path.  The object is to repurpose one word from CPUTLBEntry, removed here and added back in the next patch, to link CPUTLBEntry to CPUTLBEntryTree without changing sizeof(CPUTLBEntry).


I missed the fact that MMU_INST_FETCH was used only when fetching code at translation time.
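
To check my understanding of the layout change, a simplified sketch (field and type names assumed; the real CPUTLBEntry is a padded union, which I'm omitting here):

    /* Before: one comparator per access type, plus the addend. */
    typedef struct CPUTLBEntry {
        uint64_t addr_read;
        uint64_t addr_write;
        uint64_t addr_code;    /* removed by this patch */
        uintptr_t addend;
    } CPUTLBEntry;

    /*
     * After the next patch: the freed word holds a link to the tree
     * node, so sizeof(CPUTLBEntry) does not change.
     */
    typedef struct CPUTLBEntryTree CPUTLBEntryTree;
    typedef struct CPUTLBEntry {
        uint64_t addr_read;
        uint64_t addr_write;
        uintptr_t addend;
        CPUTLBEntryTree *tree; /* assumed name for the back link */
    } CPUTLBEntry;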


r~
