The following fixes bogus differences in scalar iteration cost computed by the vectorizer when comparing 128 and 256 bit vectorizations. This was caused by the hook looking at the vector types mode even for kind == scalar_stmt and thus returning vector operation costs for things like add or negate.
This is most noticable on targets where ix86_vec_cost applies multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus Zen and Bulldozer). But it will also adjust the actual costs everywhere where scalar and vector costs diverge. The adjustments done for Silvermont also look broken since they are applied to both scalar and vector cost which makes it mostly a no-op. The patch adjusts it to only apply for vector costing but of course I didn't benchmark this and the magic number of 1.7 may not make sense after this fix so I'm happy to leave that out - Kirill? Should I just go ahead with that for trunk (we can revert or adjust if autotesters pick up regressions on your side)? Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK? Richard. 2018-10-04 Richard Biener <rguent...@suse.de> * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost is asked for initialize mode to the component mode of the vector type. Make sure Bonnel and esp. other Atom cost adjustments are not done for scalar cost estimates. Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c (revision 264837) +++ gcc/config/i386/i386.c (working copy) @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun { unsigned *cost = (unsigned *) data; unsigned retval = 0; + bool scalar_p + = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store); tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE; int stmt_cost = - 1; bool fp = false; - machine_mode mode = TImode; + machine_mode mode = scalar_p ? SImode : TImode; if (vectype != NULL) { fp = FLOAT_TYPE_P (vectype); mode = TYPE_MODE (vectype); + if (scalar_p) + mode = TYPE_MODE (TREE_TYPE (vectype)); } if ((kind == vector_stmt || kind == scalar_stmt) @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); /* Penalize DFmode vector operations for Bonnell. */ - if (TARGET_BONNELL && kind == vector_stmt + if (TARGET_BONNELL && !scalar_p && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode) stmt_cost *= 5; /* FIXME: The value here is arbitrary. */ @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun for Silvermont as it has out of order integer pipeline and can execute 2 scalar instruction per tick, but has in order SIMD pipeline. */ if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS - || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt) + || TARGET_TREMONT || TARGET_INTEL) + && !scalar_p && stmt_info && stmt_info->stmt) { tree lhs_op = gimple_get_lhs (stmt_info->stmt); if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)