On Fri, Nov 16, 2018 at 09:07:50 +0100, Richard Henderson wrote:
> On 11/16/18 6:10 AM, Emilio G. Cota wrote:
> > It's possible that newer machines with larger reorder buffers
> > will be able to take better advantage of the higher instruction
> > locality, hiding the latency of having to execute more instructions.
> > I'll test on Skylake tomorrow.
>
> I've noticed that the code we generate for calls has twice as many
> instructions as really needed for setting up the arguments. I have a
> plan to fix that, which hopefully will solve this problem.

Ah that's great. I'll do more tests and a full review when those
changes come out, then.

Thanks,

E.