On Fri, Nov 22, 2013 at 12:03 PM, Bingfeng Mei <b...@broadcom.com> wrote:
> Well, in your modified example, it is still jump threading that produces
> code with bad control flow that cannot be if-converted and vectorized, though
> in the tree-vrp pass this time.
>
> Try this
> ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details
> -ftree-vectorize -save-temps -fno-tree-vrp
>
> The code can be vectorized.
>
> Grepping for "threading" in gcc, it seems that the dom and vrp passes are the
> two places that apply jump threading. Any other place? I think we need a
> target hook to control it.
Surely not.  It's just the usual phase ordering issue that cannot be
avoided in all cases.  Fix if-conversion instead.

Richard.

> Thanks,
> Bingfeng
>
> -----Original Message-----
> From: Andrew Pinski [mailto:pins...@gmail.com]
> Sent: 21 November 2013 21:26
> To: Bingfeng Mei
> Cc: gcc@gcc.gnu.org
> Subject: Re: Jump threading in tree dom pass prevents if-conversion &
> following vectorization
>
> On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei <b...@broadcom.com> wrote:
>> Hi,
>> I am investigating loops that can be vectorized
>> by LLVM, but not by GCC. One example is a loop that contains
>> more than one if-else construct.
>>
>> typedef signed char int8;
>> #define FFT 128
>>
>> typedef struct {
>>     int8 exp[FFT];
>> } feq_t;
>>
>> void test(feq_t *feq)
>> {
>>     int k;
>>     int feqMinimum = 15;
>>     int8 *exp = feq->exp;
>>
>>     for (k=0;k<FFT;k++) {
>>         exp[k] -= feqMinimum;
>>         if(exp[k]<-15) exp[k] = -15;
>>         if(exp[k]>15) exp[k] = 15;
>>     }
>> }
>>
>> Compile it with 4.8.2 on x86_64
>> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details
>> -ftree-vectorize -save-temps
>>
>> It is not vectorized because the if-else constructs are not properly
>> if-converted. Looking into the .ifcvt file, I found the loop is not
>> if-converted because of a bad if-else structure: one branch jumps directly
>> into another branch. Digging a bit deeper, I found that this structure
>> is generated by the dom1 pass doing jump threading optimization.
>> So recompile with
>>
>> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details
>> -ftree-vectorize -save-temps -fno-tree-dominator-opts
>>
>> It is magically if-converted and vectorized! Same on our target,
>> performance is improved greatly in this example.
>>
>> It seems to me that doing jump threading for architectures that
>> support if-conversion is not a good idea. The original if-else structures
>> are damaged so that if-conversion cannot proceed, and neither can
>> vectorization and maybe other optimizations.
>> Should we try to identify those "bad"
>> jump threadings and skip them for such architectures?
>
> This is not a bad jump threading at all.  In fact I think this is just
> a misoptimization exposed by DOM.  Rewriting it like:
> #define FFT 128
>
> typedef struct {
>     signed char exp[FFT];
> } feq_t;
>
> void test(feq_t *feq)
> {
>     int k;
>     int feqMinimum = 15;
>     signed char *exp = feq->exp;
>
>     for (k=0;k<FFT;k++) {
>         signed char temp = exp[k] - feqMinimum;
>         if(temp<-15) temp = -15;
>         if(temp>15) temp = 15;
>         exp[k] = temp;
>     }
> }
>
> --- CUT ----
> Also shows the issue even without any jump threading involved (turning
> off DOM does not fix my example).  Please file a bug with both your
> and my examples.
>
> Also what DOM is doing is getting rid of the extra store to exp[k] in
> some cases.
>
>>
>> Bingfeng Mei
>> Broadcom UK
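For reference, here is a rough sketch of what a successful if-conversion of the loop would amount to at the source level: each "if" becomes a select on a temporary, and there is a single store per element. This is not output from GCC or from any dump. The names clamp_in_place, clamp_select and forms_agree are made up for illustration, and the constant 15 stands in for feqMinimum. The checker only confirms that the branchy and the select forms compute the same values; it says nothing about what the vectorizer will do with either.

```c
#include <string.h>

#define FFT 128

typedef struct {
    signed char exp[FFT];
} feq_t;

/* Loop as originally posted: every step reads and writes exp[k]. */
void clamp_in_place(feq_t *feq)
{
    signed char *exp = feq->exp;
    for (int k = 0; k < FFT; k++) {
        exp[k] -= 15;
        if (exp[k] < -15) exp[k] = -15;
        if (exp[k] > 15) exp[k] = 15;
    }
}

/* Hypothetical if-converted form: two selects and one store per element.
   The narrowing to signed char is the same conversion the original
   performs when it stores back into exp[k]. */
void clamp_select(feq_t *feq)
{
    signed char *exp = feq->exp;
    for (int k = 0; k < FFT; k++) {
        signed char temp = exp[k] - 15;
        temp = (temp < -15) ? -15 : temp;
        temp = (temp > 15) ? 15 : temp;
        exp[k] = temp;
    }
}

/* Returns 1 if both forms agree on every possible signed char input.
   Two passes of FFT elements cover -128..-1 and 0..127. */
int forms_agree(void)
{
    for (int base = -128; base < 128; base += FFT) {
        feq_t a, b;
        for (int k = 0; k < FFT; k++)
            a.exp[k] = b.exp[k] = (signed char)(base + k);
        clamp_in_place(&a);
        clamp_select(&b);
        if (memcmp(a.exp, b.exp, FFT) != 0)
            return 0;
    }
    return 1;
}
```

Calling forms_agree() from a small main() is enough to exercise it. Whether a given GCC version vectorizes either function can be checked with the -fdump-tree-ifcvt-details and -ftree-vectorize options used earlier in the thread.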