On Nov 14, 2014, at 8:38 AM, Jeff Law <l...@redhat.com> wrote:

> On 10/20/14 22:06, Maxim Kuvyrkov wrote:
>> Hi,
>>
>> Ramana, this change requires benchmarking, which I can't easily do
>> at the moment. I would appreciate any benchmarking results that you can
>> share. In particular, the value of PARAM_SCHED_AUTOPREF_QUEUE_DEPTH
>> needs to be tuned/confirmed for Cortex-A15.
>
> What were the results of that benchmarking? IIRC I tabled reviewing this
> work waiting for those results (and I probably should have let you know that.
> Sorry, my bad there).

I don't have the benchmarking results yet, and I was hoping for ARM to help with getting the numbers. The arm maintainers still need to OK the arm-specific portion of the patch, which, I imagine, will happen only if benchmark scores improve.

...

> Can this be built on top of Bin's work for insn fusion? There's a lot of
> commonality in the structure of the insns you care about. He's already got a
> nice little priority function that I think you could utilize to ensure the
> insns with smaller offsets fire first.

I would argue that macro-fusion should have been implemented the way autopref_model is -- via the targetm.sched.first_cycle_multipass_dfa_lookahead_guard hook. To implement the autopref model I cleaned up and generalized existing infrastructure (max_issue and the dfa_lookahead_guard hook) instead of adding yet another decision-making primitive to the scheduler.

> My biggest concern would be sched2 coming along and undoing that work since
> you're not going to fuse those into move-multiple types of instructions.

The autoprefetcher will be active only during sched2. It is disabled during sched1 because max_issue is not used when scheduling for register pressure.

Thanks,

--
Maxim Kuvyrkov
www.linaro.org
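
For readers unfamiliar with the guard-style approach discussed above, here is a toy model of the idea in Python: a memory access is held back for the current cycle while another ready access to the same base has a smaller offset. This is only a sketch under stated assumptions, not GCC code. `MemInsn`, `guard`, and `issue_order` are hypothetical names, and the real hook works on the scheduler's ready list and DFA state rather than on plain lists.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MemInsn:
    """Hypothetical stand-in for a memory instruction in the ready list."""
    name: str
    base: str    # base register of the address
    offset: int  # constant offset from the base


def guard(insn: MemInsn, ready: List[MemInsn]) -> bool:
    """Return True if `insn` should be held back this cycle.

    Toy rule (an assumption for illustration): veto `insn` while some other
    ready instruction uses the same base with a strictly smaller offset.
    """
    return any(
        other is not insn
        and other.base == insn.base
        and other.offset < insn.offset
        for other in ready
    )


def issue_order(ready: List[MemInsn]) -> List[str]:
    """Issue one instruction per step, taking the first one the guard allows."""
    pending = list(ready)
    order = []
    while pending:
        # The smallest-offset access for each base is never vetoed,
        # so a candidate always exists.
        pick = next(i for i in pending if not guard(i, pending))
        pending.remove(pick)
        order.append(pick.name)
    return order


ready = [
    MemInsn("ld3", "r1", 8),
    MemInsn("ld1", "r1", 0),
    MemInsn("ld2", "r1", 4),
    MemInsn("ldx", "r2", 16),
]
print(issue_order(ready))
```

The point of the sketch is the shape of the mechanism: the guard only vetoes candidates, so ordering by offset falls out of the existing selection loop instead of requiring a separate priority function.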