Well, in your modified example, the cause is still jump threading, which produces bad control flow that cannot be if-converted and vectorized, though this time it happens in the tree-vrp pass.
Try this:

~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details -ftree-vectorize -save-temps -fno-tree-vrp

The code can then be vectorized. Grepping for "threading" in the GCC sources, it seems that the dom and vrp passes are the two places that apply jump threading. Is there any other place? I think we need a target hook to control it.

Thanks,
Bingfeng

-----Original Message-----
From: Andrew Pinski [mailto:pins...@gmail.com]
Sent: 21 November 2013 21:26
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei <b...@broadcom.com> wrote:
> Hi,
> I am doing some investigation on loops can be vectorized
> by LLVM, but not GCC. One example is loop that contains
> more than one if-else constructs.
>
> typedef signed char int8;
> #define FFT 128
>
> typedef struct {
>     int8 exp[FFT];
> } feq_t;
>
> void test(feq_t *feq)
> {
>     int k;
>     int feqMinimum = 15;
>     int8 *exp = feq->exp;
>
>     for (k=0;k<FFT;k++) {
>         exp[k] -= feqMinimum;
>         if(exp[k]<-15) exp[k] = -15;
>         if(exp[k]>15) exp[k] = 15;
>     }
> }
>
> Compile it with 4.8.2 on x86_64
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details
> -ftree-vectorize -save-temps
>
> It is not vectorized because if-else constructs are not properly
> if-converted. Looking into .ifcvt file, I found the loop is not
> if-converted because of bad if-else structure. One branch jumps directly
> into another branch. Digging a bit deeper, I found such structure
> is generated by dom1 pass doing jump threading optimization.
> So recompile with
>
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details
> -ftree-vectorize -save-temps -fno-tree-dominator-opts
>
> It is magically if-converted and vectorized! Same on our target,
> performance is improved greatly in this example.
>
> It seems to me that doing jump threading for architectures
> support if-conversion is not a good idea.
> Original if-else structures
> are damaged so that if-conversion cannot proceed, so are vectorization
> and maybe other optimizations. Should we try to identify those "bad"
> jump threading and skip them for such architectures?

This is not a bad jump threading at all. In fact I think this is just
a misoptimization exposed by DOM. Rewriting it like:

#define FFT 128

typedef struct {
    signed char exp[FFT];
} feq_t;

void test(feq_t *feq)
{
    int k;
    int feqMinimum = 15;
    signed char *exp = feq->exp;

    for (k=0;k<FFT;k++) {
        signed char temp = exp[k] - feqMinimum;
        if(temp<-15) temp = -15;
        if(temp>15) temp = 15;
        exp[k] = temp;
    }
}
--- CUT ----

Also shows the issue even without any jump threading involved (turning
off DOM does not fix my example). Please file a bug with both your and
my examples.
Also what DOM is doing is getting rid of the extra store to exp[k] in
some cases.

>
> Bingfeng Mei
> Broadcom UK
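
For anyone following the thread, the control-flow shapes being discussed can be sketched at source level. This is only an illustration under stated assumptions, not GCC's actual GIMPLE or the output of any pass. The function names clamp_branchy, clamp_threaded, clamp_select and check_variants_agree are hypothetical. clamp_branchy mirrors the two consecutive ifs from the rewritten example; clamp_threaded is a hand-written guess at the kind of path jump threading creates (once the first test succeeds, temp is -15, so the second test is known false and can be skipped); clamp_select is the branch-free form that if-conversion is meant to reach.

```c
#include <assert.h>

/* Shape 1: two consecutive ifs, as in the rewritten example above. */
signed char clamp_branchy(signed char v)
{
    signed char temp = v - 15;
    if (temp < -15) temp = -15;
    if (temp > 15) temp = 15;
    return temp;
}

/* Shape 2 (assumed, hand-written): what a threaded path looks like.
   When the first test succeeds, temp is -15, so the second test is
   known to be false and that path jumps straight past it. */
signed char clamp_threaded(signed char v)
{
    signed char temp = v - 15;
    if (temp < -15) {
        temp = -15;
        goto done;          /* threaded edge: skips the second test */
    }
    if (temp > 15) temp = 15;
done:
    return temp;
}

/* Shape 3: branch-free selects, the form if-conversion aims for. */
signed char clamp_select(signed char v)
{
    signed char temp = v - 15;
    temp = (temp < -15) ? -15 : temp;
    temp = (temp > 15) ? 15 : temp;
    return temp;
}

/* Check that all three shapes agree on every input in -128..127. */
void check_variants_agree(void)
{
    int v;
    for (v = -128; v <= 127; v++) {
        signed char in = (signed char)v;
        assert(clamp_branchy(in) == clamp_threaded(in));
        assert(clamp_branchy(in) == clamp_select(in));
    }
}
```

All three variants compute the same values; they differ only in control-flow shape, which is the property the if-conversion pass cares about. Whether a given GCC version actually produces something like the second shape is best confirmed from the dump files mentioned in the thread.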