On 10 July 2017 at 16:17, Alex Bennée <alex.ben...@linaro.org> wrote:
>
> Peter Maydell <peter.mayd...@linaro.org> writes:
>
>> On 10 July 2017 at 15:28, Alex Bennée <alex.ben...@linaro.org> wrote:
>>> While the SoftMMU is not emulating the target MMU of a system, there
>>> is a relationship between its page size and that of the target. If
>>> the target MMU is full-featured, the functions called to re-fill the
>>> SoftMMU entries start moving up the perf profiles. If we can, we
>>> should try to prevent too much thrashing around by keeping the page
>>> sizes the same.
>>>
>>> Ideally we would use TARGET_PAGE_BITS_MIN, but that potentially
>>> involves a fair bit of #include re-jigging, so I went for 10 bits
>>> (1k pages), which I think is the smallest of all our emulated
>>> systems.
>>
>> The figures certainly show an improvement, but it's not clear
>> to me why this is related to the target's page size rather than
>> just being a "bigger is better" kind of thing?
>
> Well, this was driven by a discussion with Pranith last week. In his
> (admittedly memory-intensive) benchmarking he was seeing around 30%
> overhead coming from MMU-related functions, with the hottest being
> get_phys_addr_lpae() followed by address_space_do_translate(). We
> theorised that, even given the high hit rate of the fast path, the
> slow path was being triggered by moving over the SoftMMU's effective
> page boundary. A quick experiment in extending the size of the TLB
> made his hot spots disappear.
>
> I don't see quite such a hot-spot in my simple boot/build benchmark
> test, but after helper_lookup_tb_ptr quite a lot of hits are part of
> the re-fill chain:
Right, but why do we know that the target page size matters, rather
than this just being "smaller TLB -> more TLB misses -> more calls to
the slow path -> functions called in the slow path appear more in
profiling"?

thanks
-- PMM
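
For illustration, below is a minimal, self-contained C sketch of how a
direct-mapped software TLB is indexed. This is not QEMU's actual
implementation: the macro names only echo QEMU's CPU_TLB_BITS and
TARGET_PAGE_BITS, and the values and structure are assumed for the
example. It shows why both the page granularity and the number of
entries determine how much guest address space the fast path covers
before a miss drops into the slow refill path seen in the profiles.

/*
 * Minimal sketch, not QEMU's actual implementation: a direct-mapped
 * software TLB indexed by virtual page number.  The macro names echo
 * QEMU's CPU_TLB_BITS / TARGET_PAGE_BITS; everything else is
 * simplified for illustration.
 */
#include <stdint.h>
#include <stdio.h>

#define TARGET_PAGE_BITS 12                  /* 4 KiB guest pages (assumed) */
#define CPU_TLB_BITS     8                   /* 256 entries (assumed)       */
#define CPU_TLB_SIZE     (1 << CPU_TLB_BITS)
#define TARGET_PAGE_MASK (~((1ULL << TARGET_PAGE_BITS) - 1))

typedef struct {
    uint64_t  addr;  /* page-aligned guest virtual address cached here */
    uintptr_t host;  /* where the fast path would read/write           */
} TLBEntry;

static TLBEntry tlb[CPU_TLB_SIZE];           /* zero-initialised for brevity */

/*
 * Direct-mapped lookup: the fast path covers at most
 * CPU_TLB_SIZE * TARGET_PAGE_SIZE of guest address space before a miss
 * falls back to the slow refill path.  Growing either factor extends
 * that reach.
 */
static int tlb_hit(uint64_t addr)
{
    size_t index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
    return tlb[index].addr == (addr & TARGET_PAGE_MASK);
}

int main(void)
{
    printf("fast-path reach: %llu KiB\n",
           (unsigned long long)CPU_TLB_SIZE << (TARGET_PAGE_BITS - 10));
    printf("hit for 0x1000? %d\n", tlb_hit(0x1000));
    return 0;
}

With the assumed 8 TLB bits and 4 KiB pages, the fast path covers 1 MiB
of guest addresses per MMU mode; adding bits to either factor extends
that reach, which is the effect both the bigger-TLB experiment and the
page-size argument in the thread rely on.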