On 24.09.19 00:59, Richard Henderson wrote: > There is only one caller, tlb_set_page_with_attrs. We cannot > inline the entire function because the AddressSpaceDispatch > structure is private to exec.c, and cannot easily be moved to > include/exec/memory-internal.h. > > Compute is_ram and is_romd once within tlb_set_page_with_attrs. > Fold the number of tests against these predicates. Compute > cpu_physical_memory_is_clean outside of the tlb lock region. > > Signed-off-by: Richard Henderson <richard.hender...@linaro.org> > --- > include/exec/exec-all.h | 6 +--- > accel/tcg/cputlb.c | 68 ++++++++++++++++++++++++++--------------- > exec.c | 22 ++----------- > 3 files changed, 47 insertions(+), 49 deletions(-) > > diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h > index 81b02eb2fe..49db07ba0b 100644 > --- a/include/exec/exec-all.h > +++ b/include/exec/exec-all.h > @@ -509,11 +509,7 @@ address_space_translate_for_iotlb(CPUState *cpu, int > asidx, hwaddr addr, > hwaddr *xlat, hwaddr *plen, > MemTxAttrs attrs, int *prot); > hwaddr memory_region_section_get_iotlb(CPUState *cpu, > - MemoryRegionSection *section, > - target_ulong vaddr, > - hwaddr paddr, hwaddr xlat, > - int prot, > - target_ulong *address); > + MemoryRegionSection *section); > #endif > > /* vl.c */ > diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c > index 05212ff244..05530a8b0c 100644 > --- a/accel/tcg/cputlb.c > +++ b/accel/tcg/cputlb.c > @@ -704,13 +704,14 @@ void tlb_set_page_with_attrs(CPUState *cpu, > target_ulong vaddr, > MemoryRegionSection *section; > unsigned int index; > target_ulong address; > - target_ulong code_address; > + target_ulong write_address; > uintptr_t addend; > CPUTLBEntry *te, tn; > hwaddr iotlb, xlat, sz, paddr_page; > target_ulong vaddr_page; > int asidx = cpu_asidx_from_attrs(cpu, attrs); > int wp_flags; > + bool is_ram, is_romd; > > assert_cpu_is_self(cpu); > > @@ -739,18 +740,46 @@ void tlb_set_page_with_attrs(CPUState *cpu, > target_ulong vaddr, > if (attrs.byte_swap) { > address |= TLB_BSWAP; > } > - if (!memory_region_is_ram(section->mr) && > - !memory_region_is_romd(section->mr)) { > - /* IO memory case */ > - address |= TLB_MMIO; > - addend = 0; > - } else { > + > + is_ram = memory_region_is_ram(section->mr); > + is_romd = memory_region_is_romd(section->mr); > + > + if (is_ram || is_romd) { > + /* RAM and ROMD both have associated host memory. */ > addend = (uintptr_t)memory_region_get_ram_ptr(section->mr) + xlat; > + } else { > + /* I/O does not; force the host address to NULL. */ > + addend = 0; > + } > + > + write_address = address;
I guess the only "suboptimal" change is that you now have two checks for "prot & PAGE_WRITE" twice in the case of ram instead of one. > + if (is_ram) { > + iotlb = memory_region_get_ram_addr(section->mr) + xlat; > + /* > + * Computing is_clean is expensive; avoid all that unless > + * the page is actually writable. > + */ > + if (prot & PAGE_WRITE) { > + if (section->readonly) { > + write_address |= TLB_ROM; > + } else if (cpu_physical_memory_is_clean(iotlb)) { > + write_address |= TLB_NOTDIRTY; > + } > + } > + } else { > + /* I/O or ROMD */ > + iotlb = memory_region_section_get_iotlb(cpu, section) + xlat; > + /* > + * Writes to romd devices must go through MMIO to enable write. > + * Reads to romd devices go through the ram_ptr found above, > + * but of course reads to I/O must go through MMIO. > + */ > + write_address |= TLB_MMIO; ... and here you calculate write_address even if probably unused. Can your move the calculation of the write_address completely into the "prot & PAGE_WRITE" case below? (I'm not looking at the full code, so could as well be that I am missing something :) ) -- Thanks, David / dhildenb