On Mon, Sep 16, 2013 at 3:13 AM, Richard Biener <richard.guent...@gmail.com> wrote: > On Fri, Sep 13, 2013 at 5:16 PM, Xinliang David Li <davi...@google.com> wrote: >> On Fri, Sep 13, 2013 at 1:30 AM, Richard Biener >> <richard.guent...@gmail.com> wrote: >>> On Thu, Sep 12, 2013 at 10:31 PM, Xinliang David Li <davi...@google.com> >>> wrote: >>>> Currently -ftree-vectorize turns on both loop and slp vectorizations, >>>> but there is no simple way to turn on loop vectorization alone. The >>>> logic for default O3 setting is also complicated. >>>> >>>> In this patch, two new options are introduced: >>>> >>>> 1) -ftree-loop-vectorize >>>> >>>> This option is used to turn on loop vectorization only. option >>>> -ftree-slp-vectorize also becomes a first class citizen, and no funny >>>> business of Init(2) is needed. With this change, -ftree-vectorize >>>> becomes a simple alias to -ftree-loop-vectorize + >>>> -ftree-slp-vectorize. >>>> >>>> For instance, to turn on only slp vectorize at O3, the old way is: >>>> >>>> -O3 -fno-tree-vectorize -ftree-slp-vectorize >>>> >>>> With the new change it becomes: >>>> >>>> -O3 -fno-loop-vectorize >>>> >>>> >>>> To turn on only loop vectorize at O2, the old way is >>>> >>>> -O2 -ftree-vectorize -fno-slp-vectorize >>>> >>>> The new way is >>>> >>>> -O2 -ftree-loop-vectorize >>>> >>>> >>>> >>>> 2) -ftree-vect-loop-peeling >>>> >>>> This option is used to turn on/off loop peeling for alignment. In the >>>> long run, this should be folded into the cheap cost model proposed by >>>> Richard. This option is also useful in scenarios where peeling can >>>> introduce runtime problems: >>>> http://gcc.gnu.org/ml/gcc/2005-12/msg00390.html which happens to be >>>> common in practice. >>>> >>>> >>>> >>>> Patch attached. Compiler boostrapped. Ok after testing? >>> >>> I'd like you to split 1) and 2), mainly because I agree on 1) but not on 2). >> >> Ok. Can you also comment on 2) ? > > I think we want to decide how granular we want to control the vectorizer > and using which mechanism. My cost-model re-org makes > ftree-vect-loop-version a no-op (basically removes it), so 2) looks like > a step backwards in this context.
Using cost model to do a coarse grain control/configuration is certainly something we want, but having a fine grain control is still useful. > > So, can you summarize what pieces (including versioning) of the vectorizer > you'd want to be able to disable separately? Loop peeling seems to be the main one. There is also a correctness issue related. For instance, the following code is common in practice, but loop peeling wrongly assumes initial base-alignment and generates aligned mov instruction after peeling, leading to SEGV. Peeling is not something we can blindly turned on -- even when it is on, there should be a way to turn it off explicitly: char a[10000]; void foo(int n) { int* b = (int*)(a+n); int i = 0; for (; i < 1000; ++i) b[i] = 1; } int main(int argn, char** argv) { foo(argn); } > Just disabling peeling for > alignment may get you into the versioning for alignment path (and thus > an unvectorized loop at runtime). This is not true for target supporting mis-aligned access. I have not seen a case where alignment driver loop version happens on x86. >Also it's know that the alignment peeling > code needs some serious TLC (it's outcome depends on the order of DRs, > the cost model it uses leaves to be desired as we cannot distinguish > between unaligned load and store costs). Yet another reason to turn it off as it is not effective anyways? thanks, David > > Richard. > >>> >>> I've stopped a quick try doing 1) myself because >>> >>> @@ -1691,6 +1695,12 @@ common_handle_option (struct gcc_options >>> opts->x_flag_ipa_reference = false; >>> break; >>> >>> + case OPT_ftree_vectorize: >>> + if (!opts_set->x_flag_tree_loop_vectorize) >>> + opts->x_flag_tree_loop_vectorize = value; >>> + if (!opts_set->x_flag_tree_slp_vectorize) >>> + opts->x_flag_tree_slp_vectorize = value; >>> + break; >>> >>> doesn't look obviously correct. Does that handle >>> >>> -ftree-vectorize -fno-tree-loop-vectorize -ftree-vectorize >>> >>> or >>> >>> -ftree-loop-vectorize -fno-tree-vectorize >>> >>> properly? Currently at least >>> >>> -ftree-slp-vectorize -fno-tree-vectorize >>> >>> doesn't "work". >> >> >> Right -- same is true for -fprofile-use option. FDO enables some >> passes, but can not re-enable them if they are flipped off before. >> >>> >>> That said, the option machinery doesn't handle an option being an alias >>> for two other options, so it's mechanism to contract positives/negatives >>> doesn't work here and the override hooks do not work reliably for >>> repeated options. >>> >>> Or am I wrong here? Should we care at all? Joseph? >> >> We should probably just document the behavior. Even better, we should >> deprecate the old option. >> >> thanks, >> >> David >> >>> >>> Thanks, >>> Richard. >>> >>>> >>>> thanks, >>>> >>>> David