Hi, The problem in PR66119 is that we assume MOVE_RATIO will be constant for a compilation run, such that we only need to read it once at compiler startup if we want to set up defaults for --param sra-max-scalarization-size-Osize and --param sra-max-scalarization-size-Osize.
This assumption is faulty. Some targets may have MOVE_RATIO set up to use state which depends on the selected processor - which may vary per function for a switchable target. This patch fixes the issue by always calling get_move_ratio in the SRA code, ensuring that an up-to-date value is used. Unfortunately, this means we have to use 0 as a sentinel value for the parameter - indicating no user override of the feature - and therefore cannot use it to disable scalarization. However, there are other ways to disable scalarazation (-fno-tree-sra) so this is not a great loss. Bootstrapped and tested on x86-64 and AArch64 with no issues. OK for trunk (and 5.2 after a few days watching for fallout)? Thanks, James --- gcc/ 2015-06-23 James Greenhalgh <james.greenha...@arm.com> PR tree-optimization/66119 * doc/invoke.texi (sra-max-scalarization-size-Osize): Mention that "0" is used as a sentinel value. (sra-max-scalarization-size-Ospeed): Likewise. * toplev.c (process_options): Don't set up default values for the sra_max_scalarization_size_{speed,size} parameters. * tree-sra (analyze_all_variable_accesses): If no values have been set for the sra_max_scalarization_size_{speed,size} parameters, call get_move_ratio to get target defaults.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index b99ab1c..fc9dad7 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -10894,7 +10894,9 @@ variables. These parameters control the maximum size, in storage units, of aggregate which is considered for replacement when compiling for speed (@option{sra-max-scalarization-size-Ospeed}) or size -(@option{sra-max-scalarization-size-Osize}) respectively. +(@option{sra-max-scalarization-size-Osize}) respectively. The +value 0 indicates that the compiler should use an appropriate size +for the target processor, this is the default behaviour. @item tm-max-aggregate-size When making copies of thread-local variables in a transaction, this diff --git a/gcc/toplev.c b/gcc/toplev.c index 2f43a89..902bfc7 100644 --- a/gcc/toplev.c +++ b/gcc/toplev.c @@ -1301,20 +1301,6 @@ process_options (void) so we can correctly initialize debug output. */ no_backend = lang_hooks.post_options (&main_input_filename); - /* Set default values for parameters relation to the Scalar Reduction - of Aggregates passes (SRA and IP-SRA). We must do this here, rather - than in opts.c:default_options_optimization as historically these - tuning heuristics have been based on MOVE_RATIO, which on some - targets requires other symbols from the backend. */ - maybe_set_param_value - (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED, - get_move_ratio (true) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - maybe_set_param_value - (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE, - get_move_ratio (false) * UNITS_PER_WORD, - global_options.x_param_values, global_options_set.x_param_values); - /* Some machines may reject certain combinations of options. */ targetm.target_option.override (); diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index 8e34244..e2419af 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -2549,12 +2549,24 @@ analyze_all_variable_accesses (void) bitmap tmp = BITMAP_ALLOC (NULL); bitmap_iterator bi; unsigned i; + bool optimize_speed_p = !optimize_function_for_size_p (cfun); + unsigned max_scalarization_size - = (optimize_function_for_size_p (cfun) - ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE) - : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED)) + = (optimize_speed_p + ? PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SPEED) + : PARAM_VALUE (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE)) * BITS_PER_UNIT; + /* If the user didn't set PARAM_SRA_MAX_SCALARIZATION_SIZE_<...>, + fall back to a target default. This means that zero cannot be + used to disable scalarization as we've taken it as a sentinel + value. This is not ideal, but see PR66119 for the reason we + can't simply set the target defaults ahead of time during option + handling. */ + if (!max_scalarization_size) + max_scalarization_size = get_move_ratio (optimize_speed_p) + * UNITS_PER_WORD * BITS_PER_UNIT; + EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi) if (bitmap_bit_p (should_scalarize_away_bitmap, i) && !bitmap_bit_p (cannot_scalarize_away_bitmap, i))