Richard, I have some comments about the patch.

>   -ftree-vectorizer-verbose=<number>    This switch is deprecated. Use 
> -fopt-info instead.
>
>   ftree-slp-vectorize
> ! Common Report Var(flag_tree_slp_vectorize) Optimization
>   Enable basic block vectorization (SLP) on trees

The code dealing with the interactions between -ftree-vectorize, -O3,
etc. is complicated and hard to understand. Would it be better to change
the meaning of -ftree-vectorize to mean -floop-vectorize only, and make
it independent of -fslp-vectorize?


>
> + fvect-cost-model=
> + Common Joined RejectNegative Enum(vect_cost_model) 
> Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
> + Specifies the cost model for vectorization
> +
> + Enum
> + Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown 
> vectorizer cost model %qs)
> +
> + EnumValue
> + Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)
> +
> + EnumValue
> + Enum(vect_cost_model) String(dynamic) Value(VECT_COST_MODEL_DYNAMIC)
> +
> + EnumValue
> + Enum(vect_cost_model) String(cheap) Value(VECT_COST_MODEL_CHEAP)

Introducing the cheap model is a great change.

> +

> *** 173,179 ****
>   {
>     struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>
> !   if ((unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
>       return false;
>
>     if (dump_enabled_p ())
> --- 173,180 ----
>   {
>     struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
>
> !   if (loop_vinfo->cost_model == VECT_COST_MODEL_CHEAP
> !       || (unsigned) PARAM_VALUE (PARAM_VECT_MAX_VERSION_FOR_ALIAS_CHECKS) == 0)
>       return false;
>

When cost_model == cheap, alignment peeling should also be disabled.
There will still be loops that are beneficial to vectorize without
peeling, though perhaps with a reduced net runtime gain.
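
A minimal sketch of the kind of check I have in mind, written as
standalone C so it can be read without the GCC tree. The enum is a local
stand-in for the one the patch adds, and vect_peeling_allowed_p is a
hypothetical helper name, not an existing GCC function:

```c
#include <stdbool.h>

/* Local stand-in for the cost model enum introduced by the patch
   (assumption: only the three user-visible values matter here).  */
enum vect_cost_model
{
  VECT_COST_MODEL_UNLIMITED,
  VECT_COST_MODEL_DYNAMIC,
  VECT_COST_MODEL_CHEAP
};

/* Hypothetical helper: return true if the vectorizer may peel a loop
   to improve alignment under MODEL.  The cheap model refuses peeling
   because the peeled prologue increases code size.  */
bool
vect_peeling_allowed_p (enum vect_cost_model model)
{
  return model != VECT_COST_MODEL_CHEAP;
}
```

The peeling analysis would then bail out early when this returns false,
in the same way the patch already does for alias versioning.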



>   struct gimple_opt_pass pass_slp_vectorize =
> --- 206,220 ----
>   static bool
>   gate_vect_slp (void)
>   {
> !   /* Apply SLP either according to whether the user specified whether to
> !      run SLP or not, or according to whether the user specified whether
> !      to do vectorization or not.  */
> !   if (global_options_set.x_flag_tree_slp_vectorize)
> !     return flag_tree_slp_vectorize != 0;
> !   if (global_options_set.x_flag_tree_vectorize)
> !     return flag_tree_vectorize != 0;
> !   /* And if vectorization was enabled by default run SLP only at -O3.  */
> !   return flag_tree_vectorize != 0 && optimize == 3;
>   }

The logic can be greatly simplified if the SLP vectorizer is controlled
independently, which is also easier for users to understand.
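
Roughly, with independent flags each gate reduces to a single test. The
sketch below is standalone C under that assumption. struct vect_flags and
its field names are hypothetical stand-ins for the option variables, and
how -O3 sets their defaults is left to option processing rather than to
the gates:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two option variables, assuming
   -ftree-vectorize means loop vectorization only.  */
struct vect_flags
{
  int loop_vectorize;	/* loop vectorizer enabled?  */
  int slp_vectorize;	/* basic block (SLP) vectorizer enabled?  */
};

/* Gate for the loop vectorizer: depends only on its own flag.  */
bool
gate_vect_loop (const struct vect_flags *flags)
{
  return flags->loop_vectorize != 0;
}

/* Gate for the SLP vectorizer: depends only on its own flag, with no
   reference to the loop vectorizer flag or the optimization level.  */
bool
gate_vect_slp (const struct vect_flags *flags)
{
  return flags->slp_vectorize != 0;
}
```

With this shape there is no need to consult global_options_set in the
gate at all.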


> ! @item -fvect-cost-model=@var{model}
>   @opindex fvect-cost-model
> ! Alter the cost model used for vectorization.  The @var{model} argument
> ! should be one of @code{unlimited}, @code{dynamic} or @code{cheap}.
> ! With the @code{unlimited} model the vectorized code-path is assumed
> ! to be profitable while with the @code{dynamic} model a runtime check
> ! will guard the vectorized code-path to enable it only for iteration
> ! counts that will likely execute faster than when executing the original
> ! scalar loop.  The @code{cheap} model will disable vectorization of
> ! loops where doing so would be cost prohibitive for example due to
> ! required runtime checks for data dependence or alignment but otherwise
> ! is equal to the @code{dynamic} model.
> ! The default cost model depends on other optimization flags and is
> ! either @code{dynamic} or @code{cheap}.
>

In theory the vectorizer will only vectorize a loop when there is a net
runtime gain, so the 'cost' here should only mean code size and compile
time cost.

Cheap Model: with this model, the compiler will vectorize loops that
are considered beneficial for runtime performance with minimal code
size increase and compile time cost.
Unlimited Model: the compiler will vectorize loops to maximize runtime
gain without considering compile time cost or the impact on code size.


thanks,

David
