Paolo Bonzini <pbonz...@redhat.com> writes:

> Computing TranslationBlock flags is pretty expensive on ARM, especially
> 32-bit. Because tbflags are computed on every tb lookup, it is not
> unlikely to see cpu_get_tb_cpu_state close to the top of the profile
> now that QHT makes the hash table much more efficient.
>
> However, most tbflags only change when the EL is switched or after
> MSR instructions. Based on this observation, this series caches these
> tbflags in CPUARMState, resulting in a 10-15% speedup on 32-bit code.
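
For illustration only, here is a minimal sketch of the caching pattern described above. It is not the code from the series: every name in it (ToyCPUState, toy_compute_flags, toy_rebuild_flags, toy_set_el, toy_write_sctlr, toy_get_flags) is hypothetical, and the flag layout is invented. The assumption is that the flags are rebuilt at each point that changes their inputs, so the lookup path only reads the cached word. The assert in the lookup path is a debug-build check that would catch a missed rebuild.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified CPU state; this is not QEMU's CPUARMState. */
typedef struct ToyCPUState {
    uint32_t el;           /* current exception level */
    uint32_t sctlr;        /* stand-in for a system register feeding the flags */
    uint32_t cached_flags; /* flags derived from el and sctlr */
} ToyCPUState;

/* The "expensive" derivation (invented layout):
 * EL in bits [1:0], bit 0 of sctlr in bit 2. */
static uint32_t toy_compute_flags(const ToyCPUState *env)
{
    return (env->el & 3u) | ((env->sctlr & 1u) << 2);
}

/* Refresh the cache; must be called whenever an input changes. */
static void toy_rebuild_flags(ToyCPUState *env)
{
    env->cached_flags = toy_compute_flags(env);
}

/* EL switch: update the state, then rebuild the cached flags. */
static void toy_set_el(ToyCPUState *env, uint32_t el)
{
    env->el = el;
    toy_rebuild_flags(env);
}

/* MSR-style system register write: same rule as an EL switch. */
static void toy_write_sctlr(ToyCPUState *env, uint32_t value)
{
    env->sctlr = value;
    toy_rebuild_flags(env);
}

/* Hot path: return the cached word. The assert recomputes the flags
 * and compares, and disappears when built with NDEBUG. */
static uint32_t toy_get_flags(const ToyCPUState *env)
{
    assert(env->cached_flags == toy_compute_flags(env));
    return env->cached_flags;
}
```

Any code that writes el or sctlr directly, without going through the setters, would leave the cache stale; in this sketch the assert in toy_get_flags is what would flag that.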

Hi,

I'm starting to clear out my review queue, but I notice these no longer
apply cleanly to master. Were you going to re-issue the series once
you've addressed Peter's concerns?

My general comment is that I think this is a good idea, but my concern is
ensuring that state changes get picked up and that we don't end up with
inconsistent state between the real and cached values. I still have the
scars from my last attempt to rationalise cpu.h pstate, aarch64,
uncached_cpsr and spsr!

Cheers,

--
Alex Bennée