On 10/9/24 17:35, Pierrick Bouvier wrote:
@@ -1061,15 +1061,13 @@ static inline void tlb_set_compare(CPUTLBEntryFull *full,
                                    CPUTLBEntry *ent,
                                    vaddr address, int flags,
                                    MMUAccessType access_type, bool enable)
 {
-    if (enable) {
-        address |= flags & TLB_FLAGS_MASK;
-        flags &= TLB_SLOW_FLAGS_MASK;
-        if (flags) {
-            address |= TLB_FORCE_SLOW;
-        }
-    } else {
-        address = -1;
-        flags = 0;
+    if (!enable) {
+        address = TLB_INVALID_MASK;
+    }
+    address |= flags & TLB_FLAGS_MASK;
+    flags &= TLB_SLOW_FLAGS_MASK;
+    if (flags) {
+        address |= TLB_FORCE_SLOW;
     }
I'm not sure I follow this change correctly. After, the final address and flags values depend only on the flags parameter, while before they depended on both the flags and enable parameters.
This deserves to be split out; I even thought about it Monday night but then forgot when I
restarted on Tuesday morning.
Before, address is -1 for a disabled entry, mostly because that mirrors what you get from the memset that initializes the tlb. All of the flags are discarded, but the only thing that actually matters is that TLB_INVALID_MASK is set.
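As a tiny self-contained sketch of why all-ones is good enough (the mask value below is invented for the demo; the real definition lives in QEMU's headers):

    #include <assert.h>
    #include <stdint.h>

    /* Demo-only value; the real TLB_INVALID_MASK is defined by QEMU. */
    #define TLB_INVALID_MASK  (1ull << 0)

    int main(void)
    {
        /* An all-ones comparator, as produced by memset(..., 0xff, ...),
         * necessarily has TLB_INVALID_MASK set, so no lookup can match. */
        uint64_t addr_read = (uint64_t)-1;
        assert(addr_read & TLB_INVALID_MASK);
        return 0;
    }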
After, TLB_INVALID_MASK is still set, but the flags are retained. This is because we want a source of those flags to use for MMU_INST_FETCH: with this patch set we no longer store separate flags for execute, and instead grab them from the flags for read. From tlb_set_page_full...
@@ -1215,9 +1213,6 @@ void tlb_set_page_full(CPUState *cpu, int mmu_idx,
     /* Now calculate the new entry */
     node->copy.addend = addend - addr_page;
 
-    tlb_set_compare(full, &node->copy, addr_page, read_flags,
-                    MMU_INST_FETCH, prot & PAGE_EXEC);
-
     if (wp_flags & BP_MEM_READ) {
         read_flags |= TLB_WATCHPOINT;
     }
... we can see that the only difference between the read flags and the fetch flags is the watchpoint bit. Importantly, TLB_MMIO is common to all three comparators.
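To make that concrete, a minimal sketch (the helper name is my own invention, not from the patch) of how the fetch flags can be recovered from the stored read flags:

    /*
     * Sketch only; the helper name is invented.  Watchpoints apply to
     * data accesses, not code fetch, so TLB_WATCHPOINT is the only bit
     * that can differ.  Clearing it from the read flags reproduces what
     * we would have stored for MMU_INST_FETCH.
     */
    static inline int fetch_flags_from_read_flags(int read_flags)
    {
        return read_flags & ~TLB_WATCHPOINT;
    }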
It sounds good to have a fast path for code fetch. Did you measure the benefit, or did you just implement it on the assumption that it would be worthwhile?
This is not about a fast path for code fetch, but about always using the *slow* path for it. The goal is to repurpose one word of CPUTLBEntry, removed here and added back in the next patch, to link CPUTLBEntry to CPUTLBEntryTree without changing sizeof(CPUTLBEntry).
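Roughly, and only as an illustration of the layout idea (the field names below are invented, and the real CPUTLBEntry involves a union), the freed word becomes the link:

    #include <stdint.h>

    /*
     * Illustrative sketch, not the patch's code.  The word that used to
     * hold the addr_code comparator now points at the out-of-line tree
     * node, so, on a 64-bit host, sizeof(CPUTLBEntry) is unchanged.
     */
    typedef struct CPUTLBEntryTree CPUTLBEntryTree;

    typedef struct CPUTLBEntry {
        uint64_t addr_read;      /* comparator for MMU_DATA_LOAD */
        uint64_t addr_write;     /* comparator for MMU_DATA_STORE */
        CPUTLBEntryTree *tree;   /* was addr_code; links to the tree node */
        uintptr_t addend;        /* host addr = guest addr + addend */
    } CPUTLBEntry;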
r~