On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener
<richard.guent...@gmail.com> wrote:
> On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davi...@google.com> wrote:
>> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener
>> <richard.guent...@gmail.com> wrote:
>>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davi...@google.com> 
>>> wrote:
>>>> Currently -ftree-vectorize turns on both loop and slp vectorizations,
>>>> but there is no simple way to turn on loop vectorization alone. The
>>>> logic for default O3 setting is also complicated.
>>>>
>>>> In this patch, two new options are introduced:
>>>>
>>>> 1) -ftree-loop-vectorize
>>>>
>>>> This option is used to turn on loop vectorization only. option
>>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny
>>>> business of Init(2) is needed.  With this change, -ftree-vectorize
>>>> becomes a simple alias to -ftree-loop-vectorize +
>>>> -ftree-slp-vectorize.
>>>>
>>>> For instance, to turn on only slp vectorize at O3, the old way is:
>>>>
>>>>      -O3 -fno-tree-vectorize -ftree-slp-vectorize
>>>>
>>>> With the new change it becomes:
>>>>
>>>>     -O3 -fno-loop-vectorize
>>>>
>>>>
>>>> To turn on only loop vectorize at O2, the old way is
>>>>
>>>>     -O2 -ftree-vectorize -fno-slp-vectorize
>>>>
>>>> The new way is
>>>>
>>>>     -O2 -ftree-loop-vectorize
>>>>
>>>>
>>>>
>>>> 2) -ftree-vect-loop-peeling
>>>>
>>>> This option is used to turn on/off loop peeling for alignment.  In the
>>>> long run, this should be folded into the cheap cost model proposed by
>>>> Richard.  This option is also useful in scenarios where peeling can
>>>> introduce runtime problems:
>>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html  which happens to be
>>>> common in practice.
>>>>
>>>>
>>>>
>>>> Patch attached. Compiler boostrapped. Ok after testing?
>>>
>>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2).
>>
>> Ok. Can you also comment on 2) ?
>
> I think we want to decide how granular we want to control the vectorizer
> and using which mechanism.  My cost-model re-org makes
> ftree-vect-loop-version a no-op (basically removes it), so 2) looks like
> a step backwards in this context.

Using cost model to do a coarse grain control/configuration is
certainly something we want, but having a fine grain control is still
useful.

>
> So, can you summarize what pieces (including versioning) of the vectorizer
> you'd want to be able to disable separately?

Loop peeling seems to be the main one. There is also a correctness
issue related. For instance, the following code is common in practice,
but loop peeling wrongly assumes initial base-alignment and generates
aligned mov instruction after peeling, leading to SEGV.  Peeling is
not something we can blindly turned on -- even when it is on, there
should be a way to turn it off explicitly:

char a[10000];

void foo(int n)
{
  int* b = (int*)(a+n);
  int i = 0;
  for (; i < 1000; ++i)
    b[i] = 1;
}

int main(int argn, char** argv)
{
  foo(argn);
}



>  Just disabling peeling for
> alignment may get you into the versioning for alignment path (and thus
> an unvectorized loop at runtime).

This is not true for target supporting mis-aligned access. I have not
seen a case where alignment driver loop version happens on x86.

>Also it's know that the alignment peeling
> code needs some serious TLC (it's outcome depends on the order of DRs,
> the cost model it uses leaves to be desired as we cannot distinguish
> between unaligned load and store costs).

Yet another reason to turn it off as it is not effective anyways?


thanks,

David

>
> Richard.
>
>>>
>>> I've stopped a quick try doing 1) myself because
>>>
>>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options
>>>          opts->x_flag_ipa_reference = false;
>>>        break;
>>>
>>> +    case OPT_ftree_vectorize:
>>> +      if (!opts_set->x_flag_tree_loop_vectorize)
>>> + opts->x_flag_tree_loop_vectorize = value;
>>> +      if (!opts_set->x_flag_tree_slp_vectorize)
>>> + opts->x_flag_tree_slp_vectorize = value;
>>> +      break;
>>>
>>> doesn't look obviously correct.  Does that handle
>>>
>>>   -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize
>>>
>>> or
>>>
>>>   -ftree-loop-vectorize -fno-tree-vectorize
>>>
>>> properly?  Currently at least
>>>
>>>   -ftree-slp-vectorize -fno-tree-vectorize
>>>
>>> doesn't "work".
>>
>>
>> Right -- same is true for -fprofile-use option. FDO enables some
>> passes, but can not re-enable them if they are flipped off before.
>>
>>>
>>> That said, the option machinery doesn't handle an option being an alias
>>> for two other options, so it's mechanism to contract positives/negatives
>>> doesn't work here and the override hooks do not work reliably for
>>> repeated options.
>>>
>>> Or am I wrong here?  Should we care at all?  Joseph?
>>
>> We should probably just document the behavior. Even better, we should
>> deprecate the old option.
>>
>> thanks,
>>
>> David
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> thanks,
>>>>
>>>> David

Reply via email to