(Sorry about the delay in answering this. I could blame the delay on the merge window, but in reality I've been procrastinating on this due to the permanent, non-trivial impact PIE has on generated C code.)

* Thomas Garnier <thgar...@google.com> wrote:

> 1) PIE sometime needs two instructions to represent a single
> instruction on mcmodel=kernel.

What again is the typical frequency of this occurring in an x86-64 defconfig
kernel, with the very latest GCC?

Also, to make sure: which unwinder did you use for your measurements,
frame-pointers or ORC? Please use ORC only for future numbers, as
frame-pointers is obsolete from a performance measurement POV.

> 2) GCC does not optimize switches in PIE in order to reduce relocations:

Hopefully this can either be fixed in GCC or at least influenced via a compiler
switch in the future.

> The switches are the biggest increase on small functions but I don't
> think they represent a large portion of the difference (number 1 is).

Ok.

> A side note, while testing gcc 7.2.0 on hackbench I have seen the PIE
> kernel being faster by 1% across multiple runs (comparing 50 runs done
> across 5 reboots twice). I don't think PIE is faster than a
> mcmodel=kernel but recent versions of gcc makes them fairly similar.

So I think we are down to an overhead range where the inherent noise (both
random and systematic one) in 'hackbench' overwhelms the signal we are trying
to measure.

So I think it's the kernel .text size change that is the best noise-free proxy
for the overhead impact of PIE.

It doesn't hurt to double check actual real performance as well, just don't
expect there to be much of a signal for anything but fully cached
microbenchmark workloads.

Thanks,

	Ingo
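
(For reference, a minimal sketch of the .text size comparison suggested above. It assumes the two kernels were each inspected with binutils 'size' in its default Berkeley format, and that the output was saved as text. The function names and the sample numbers are hypothetical, not measurements. Note that the Berkeley 'text' column also counts read-only data, so 'size -A' is the better source if only the .text section proper is wanted.)

```python
def parse_size_text(size_output: str) -> int:
    """Return the 'text' column from Berkeley-format `size` output.

    Expects a header line (text data bss dec hex filename) followed by
    one line of values for a single file.
    """
    lines = [line for line in size_output.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected a header line and a value line")
    header = lines[0].split()
    values = lines[1].split()
    return int(values[header.index("text")])


def text_growth_percent(baseline_bytes: int, pie_bytes: int) -> float:
    """Relative size change of the PIE build versus the baseline, in percent."""
    if baseline_bytes <= 0:
        raise ValueError("baseline size must be positive")
    return 100.0 * (pie_bytes - baseline_bytes) / baseline_bytes


if __name__ == "__main__":
    # Made-up sample output, only to show the expected input shape.
    baseline = "   text\t   data\t    bss\t    dec\t    hex\tfilename\n1000\t200\t50\t1250\t4e2\tvmlinux\n"
    pie = "   text\t   data\t    bss\t    dec\t    hex\tfilename\n1010\t200\t50\t1260\t4ec\tvmlinux\n"
    growth = text_growth_percent(parse_size_text(baseline), parse_size_text(pie))
    print(f"text growth: {growth:.2f}%")
```

Unlike a hackbench run, this number is deterministic for a given compiler and config, which is why it works as a noise-free proxy.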