Hi,

Gentle ping this:

https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571258.html

BR,
Kewen

on 2021/6/28 3:01 PM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> Gentle ping this:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571258.html
>
> BR,
> Kewen
>
> on 2021/6/9 10:26 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> Gentle ping this:
>>
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571258.html
>>
>> BR,
>> Kewen
>>
>> on 2021/5/26 10:59 AM, Kewen.Lin via Gcc-patches wrote:
>>> Hi,
>>>
>>> This is the updated version of the patch to deal with the bwaves_r
>>> degradation due to vector construction fed by strided loads.
>>>
>>> Following Richi's comments [1], this takes a similar approach and
>>> overprices the vector construction fed by VMAT_ELEMENTWISE or
>>> VMAT_STRIDED_SLP. Instead of adding the extra cost immediately when
>>> costing the vector construction, it first records how many loads
>>> and vectorized statements there are in the given loop. Later, in
>>> rs6000_density_test (called by finish_cost), it computes the ratio
>>> of loads to all vectorized stmts and checks it against the
>>> corresponding thresholds DENSITY_LOAD_NUM_THRESHOLD and
>>> DENSITY_LOAD_PCT_THRESHOLD, applying the actual extra pricing only
>>> if both thresholds are exceeded.
>>>
>>> Note that this new load density heuristic check is based on
>>> some fields in the target cost data which are updated as needed
>>> when scanning each add_stmt_cost entry; it's independent of the
>>> existing code in rs6000_density_test, which requires scanning
>>> non_vect stmts. Since it checks the load stmt count against all
>>> vectorized stmts, it's a kind of density, so I put it in function
>>> rs6000_density_test. For the same reason, to keep it independent,
>>> I didn't put it as an else arm of the existing density threshold
>>> check hunk or before this hunk.
>>>
>>> While investigating the -1.04% degradation in 526.blender_r
>>> on Power8, I noticed that the extra penalized cost of 320 on one
>>> single vector construction with type V16QI is greatly exaggerated,
>>> which makes the final body cost unreliable, so this patch adds
>>> an upper bound on the extra penalized cost for each vector
>>> construction statement.
>>>
>>> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
>>>
>>> Full SPEC2017 performance evaluation on Power8/Power9 with
>>> option combinations:
>>>   * -O2 -ftree-vectorize {,-fvect-cost-model=very-cheap} {,-ffast-math}
>>>   * {-O3, -Ofast} {,-funroll-loops}
>>>
>>> The bwaves_r degradations on P8/P9 have been fixed; nothing else
>>> remarkable was observed.
>>>
>>> Is it ok for trunk?
>>>
>>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570076.html
>>>
>>> BR,
>>> Kewen
>>> -----
>>> gcc/ChangeLog:
>>>
>>> 	* config/rs6000/rs6000.c (struct rs6000_cost_data): New members
>>> 	nstmts, nloads and extra_ctor_cost.
>>> 	(rs6000_density_test): Add load density related heuristics and the
>>> 	checks, do extra costing on vector construction statements if needed.
>>> 	(rs6000_init_cost): Init new members.
>>> 	(rs6000_update_target_cost_per_stmt): New function.
>>> 	(rs6000_add_stmt_cost): Factor vect_nonmem hunk out to function
>>> 	rs6000_update_target_cost_per_stmt and call it.
>>>
>>
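
For anyone skimming the thread, the heuristic described in the quoted
message can be sketched roughly as below. This is only a simplified,
standalone illustration, not the code from the patch: the struct and
function names (cost_sketch, sketch_update_per_stmt, sketch_density_test),
the MAX_EXTRA_CTOR_COST bound, the threshold values and the way the raw
penalty is passed in are all assumptions made up for the example. Only the
member names nstmts, nloads and extra_ctor_cost and the two threshold
names come from the message.

```c
#include <stdbool.h>

/* Placeholder values chosen for illustration; not taken from the patch.  */
static const unsigned DENSITY_LOAD_NUM_THRESHOLD = 20;
static const unsigned DENSITY_LOAD_PCT_THRESHOLD = 45;
static const unsigned MAX_EXTRA_CTOR_COST = 12; /* hypothetical upper bound */

/* Hypothetical stand-in for the per-loop target cost data.  */
struct cost_sketch
{
  unsigned nstmts;          /* all vectorized statements seen so far */
  unsigned nloads;          /* vectorized loads seen so far */
  unsigned extra_ctor_cost; /* deferred penalty for vector constructions */
  unsigned body_cost;       /* accumulated loop body cost */
};

/* Called once per costed statement: only record counts and the deferred,
   capped penalty; nothing is added to the body cost yet.  */
void
sketch_update_per_stmt (struct cost_sketch *d, unsigned count, bool is_load,
                        bool is_strided_ctor, unsigned raw_penalty)
{
  d->nstmts += count;
  if (is_load)
    d->nloads += count;
  if (is_strided_ctor)
    {
      /* Bound the penalty per construction statement so that a single
         wide vector cannot dominate the body cost.  */
      unsigned capped = raw_penalty < MAX_EXTRA_CTOR_COST
                        ? raw_penalty : MAX_EXTRA_CTOR_COST;
      d->extra_ctor_cost += capped;
    }
}

/* Called once when costing finishes: apply the deferred penalty only if
   both the absolute load count and the load percentage are high.  */
void
sketch_density_test (struct cost_sketch *d)
{
  if (d->nstmts == 0)
    return;
  unsigned load_pct = d->nloads * 100 / d->nstmts;
  if (d->nloads > DENSITY_LOAD_NUM_THRESHOLD
      && load_pct > DENSITY_LOAD_PCT_THRESHOLD)
    d->body_cost += d->extra_ctor_cost;
}
```

The point of deferring the penalty is that a single strided-load-fed
construction in an otherwise compute-heavy loop is left alone, while a
loop dominated by loads gets the extra cost.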