On Thu, Sep 5, 2019 at 10:53 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Thu, Sep 5, 2019 at 7:47 AM Hongtao Liu <crazy...@gmail.com> wrote:
> >
> > Change cost from 2->6 got
> > -------------
> > 531.deepsjeng_r  9.64%
> > 548.exchange_r  10.24%
> > 557.xz_r         7.99%
> > 508.namd_r       1.08%
> > 527.cam4_r       6.91%
> > 553.nab_r        3.06%
> > ------------
> >
> > for 531,548,557,527, even better comparing to version before regression.
> > for 508,553, still little regressions comparing to version before
> > regression.
>
> Good, that brings us into "noise" region.
>
> Based on these results and other findings, I propose the following solution:
>
> - The inter-regset move costs of architectures, that have been defined
> before r125951 remain the same. These are: size, i386, i486, pentium,
> pentiumpro, geode, k6, athlon, k8, amdfam10, pentium4 and nocona.
> - bdver, btver1 and btver2 have costs higher than 8, so they are not affected.
> - lakemont, znver1, znver2, atom, slm, intel and generic costs have
> inter-regset costs above intra-regset and below or equal memory
> load/store cost, should remain as they are. Additionally, intel and
> generic costs are regularly re-tuned.
> - only skylake and core costs remain problematic
>
> So, I propose to raise XMM<->intreg costs of skylake and core
> architectures to 6 to solve the regression. These can be fine-tuned
> later, we are now able to change the cost for RA independently of RTX
> costs. Also, the RA cost can be asymmetrical.
>
> Attached patch implements the proposal. If there are no other
> proposals or discussions, I plan to commit it on Friday.
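For illustration only, the "above intra-regset and below or equal memory
load/store cost" criterion from the list above could be sketched roughly
as follows. This is not code from GCC: struct ra_costs, its field names
and inter_cost_in_range are hypothetical, and the struct is a
deliberately simplified stand-in for struct processor_costs.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of one tuning's register-allocator
   costs.  This is NOT GCC's struct processor_costs; the type and its
   field names are invented for this sketch.  */
struct ra_costs
{
  int intra_regset_move;	/* move inside one register set */
  int inter_regset_move;	/* SSE<->integer move */
  int mem_load_store;		/* memory load/store cost */
};

/* Return true if the inter-regset move cost is above the intra-regset
   move cost and below or equal to the memory load/store cost, i.e. in
   the range the proposal describes as fine to leave alone.  */
static bool
inter_cost_in_range (const struct ra_costs *c)
{
  return (c->inter_regset_move > c->intra_regset_move
	  && c->inter_regset_move <= c->mem_load_store);
}
```

As a usage example, assume an intra-regset move cost of 2 (an assumption,
not a value taken from the patch) and a memory cost of 6 (the 32-bit SSE
store cost visible in the core_cost hunk). The check then fails for the
old inter-regset cost of 2 and passes for the new cost of 6.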

2019-09-06  Uroš Bizjak  <ubiz...@gmail.com>

	PR target/91654
	* config/i386/x86-tune-costs.h (skylake_cost): Raise the cost of
	SSE->integer and integer->SSE moves from 2 to 6.
	(core_cost): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

diff --git a/gcc/config/i386/x86-tune-costs.h b/gcc/config/i386/x86-tune-costs.h
index 3381b8bf143c..00edece3eb68 100644
--- a/gcc/config/i386/x86-tune-costs.h
+++ b/gcc/config/i386/x86-tune-costs.h
@@ -1610,7 +1610,7 @@ struct processor_costs skylake_cost = {
 					   in 32,64,128,256 and 512-bit */
   {8, 8, 8, 12, 24},			/* cost of storing SSE registers
 					   in 32,64,128,256 and 512-bit */
-  2, 2,					/* SSE->integer and integer->SSE moves */
+  6, 6,					/* SSE->integer and integer->SSE moves */
   /* End of register allocator costs.  */
   },
 
@@ -2555,7 +2555,7 @@ struct processor_costs core_cost = {
 					   in 32,64,128,256 and 512-bit */
   {6, 6, 6, 6, 12},			/* cost of storing SSE registers
 					   in 32,64,128,256 and 512-bit */
-  2, 2,					/* SSE->integer and integer->SSE moves */
+  6, 6,					/* SSE->integer and integer->SSE moves */
   /* End of register allocator costs.  */
   },
 