From: "H.J. Lu"
Check TARGET_USE_VECTOR_FP_CONVERTS or TARGET_USE_VECTOR_CONVERTS when
handling the avx_partial_xmm_update attribute. Don't convert an AVX partial
XMM register update if a vector packed SSE conversion should be used.
gcc/
PR target/101900
* config/i386/i386-features.c (r
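As an illustration (not part of the patch, function name made up), this is
the kind of scalar conversion the avx_partial_xmm_update attribute marks;
the compiler either breaks the partial update or, when
TARGET_USE_VECTOR_FP_CONVERTS is set, uses a packed conversion instead:

/* Hypothetical example: scalar double -> float conversion.  The SSE
   cvtsd2ss instruction writes only the low 32 bits of the destination
   XMM register, leaving the upper bits dependent on its old contents.  */
float
f64_to_f32 (double x)
{
  return (float) x;
}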
From: "H.J. Lu"
Simplify memcpy and memset inline strategies to avoid branches for
-mtune=tremont:
1. Create Tremont cost model from generic cost model.
2. With MOVE_RATIO and CLEAR_RATIO == 17, GCC will use integer/vector
load and store for up to 16 * 16 (256) bytes when the data size is
fi
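For instance (illustrative only, function names are made up), a fixed-size
copy or clear of 256 bytes can then be expanded inline instead of branching
to the library routine:

#include <string.h>

/* With MOVE_RATIO == 17 a constant-size copy of up to 256 bytes can be
   expanded as straight integer/vector loads and stores.  */
void
copy_block (char *dst, const char *src)
{
  memcpy (dst, src, 256);
}

/* Likewise for a constant-size clear with CLEAR_RATIO == 17.  */
void
clear_block (char *dst)
{
  memset (dst, 0, 256);
}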
From: "H.J. Lu"
1. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY in SSE FP to FP splitters.
2. Replace TARGET_SSE_PARTIAL_REG_DEPENDENCY with
TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY in SSE INT to FP splitters.
3. Also check TARGET_SSE_PARTIAL_REG
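For reference (illustrative only, function names are made up), the two kinds
of scalar conversions these splitters cover look like:

/* FP to FP: cvtss2sd writes only the low 64 bits of the destination XMM
   register; gated by TARGET_SSE_PARTIAL_REG_FP_CONVERTS_DEPENDENCY.  */
double
f32_to_f64 (float x)
{
  return (double) x;
}

/* INT to FP: cvtsi2ss writes only the low 32 bits of the destination XMM
   register; gated by TARGET_SSE_PARTIAL_REG_CONVERTS_DEPENDENCY.  */
float
i32_to_f32 (int i)
{
  return (float) i;
}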
From: "H.J. Lu"
Initial -mtune=tremont update
1. Use Haswell scheduling model.
2. Assume that the stack engine allows push&pop instructions to execute in
parallel.
3. Prepare for the scheduling pass in the same way as -mtune=generic.
4. Use the same issue rate as -mtune=generic.
5. Enable partial_reg_dependency.
6. Disab
From: "Cui,Lili"
Hi,
I have four patches for Tremont tuning. With all patches applied, the
performance impacts on SPEC CPU 2017 are:
500.perlbench_r 1.81%
502.gcc_r 0.57%
505.mcf_r 1.16%
520.omnetpp_r 0.00%
523.xalancbmk_r 0.