RE: rs6000: load_multiple code
Paulo Matos > -Original Message- > From: Alan Modra [mailto:amo...@gmail.com] > Sent: 22 November 2013 04:42 > To: Paulo Matos > Cc: gcc@gcc.gnu.org > Subject: Re: rs6000: load_multiple code > > On Wed, Nov 20, 2013 at 05:06:13PM +, Paulo Matos wrote: > > I am looking into how rs6000 implements load multiple code > [snip] > > No pseudos are involved for the destination. See the FAIL in > rs6000.md load_multiple. Right, I missed that bit: if (... || REGNO (operands[0]) >= 32) FAIL; This will basically never match at expand time then, and will have little, if any, use before register allocation then. Right? > > -- > Alan Modra > Australia Development Lab, IBM
Re: [doc] Fixing reference inside Extended-Asm.html
Is the version of texinfo used to generate the online documentation buggy? Sorry for the delayed response. I was hoping the gcc expert on docs would respond so I could see who that was. I have been doing some work on Extended-Asm.html (see the work in progress at http://www.limegreensocks.com/gcc/Using-Assembly-Language-with-C.html) and I haven't had a problem generating output. dw
RE: Jump threading in tree dom pass prevents if-conversion & following vectorization
Well, in your modified example, it is still due to jump threading that produce code of bad control flow that cannot be if-converted and vectorized, though in tree-vrp pass this time. Try this ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details -ftree-vectorize -save-temps -fno-tree-vrp The code can be vectorized. Grep "threading" in gcc, it seems that dom and vrp passes are two places that apply jump threading. Any other place? I think we need an target hook to control it. Thanks, Bingfeng -Original Message- From: Andrew Pinski [mailto:pins...@gmail.com] Sent: 21 November 2013 21:26 To: Bingfeng Mei Cc: gcc@gcc.gnu.org Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei wrote: > Hi, > I am doing some investigation on loops can be vectorized > by LLVM, but not GCC. One example is loop that contains > more than one if-else constructs. > > typedef signed char int8; > #define FFT 128 > > typedef struct { > int8 exp[FFT]; > } feq_t; > > void test(feq_t *feq) > { > int k; > int feqMinimum = 15; > int8 *exp = feq->exp; > > for (k=0;k<FFT;k++) { > exp[k] -= feqMinimum; > if(exp[k]<-15) exp[k] = -15; > if(exp[k]>15) exp[k] = 15; > } > } > > Compile it with 4.8.2 on x86_64 > ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps > > It is not vectorized because if-else constructs are not properly > if-converted. Looking into .ifcvt file, I found the loop is not > if-converted because of bad if-else structure. One branch jumps directly > into another branch. Digging a bit deeper, I found such structure > is generated by dom1 pass doing jump threading optimization. > So recompile with > > ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps -fno-tree-dominator-opts > > It is magically if-converted and vectorized! Same on our target, > performance is improved greatly in this example. > > It seems to me that doing jump threading for architectures > support if-conversion is not a good idea. Original if-else structures > are damaged so that if-conversion cannot proceed, so are vectorization > and maybe other optimizations. Should we try to identify those "bad" > jump threading and skip them for such architectures? This is not a bad jump threading at all. In fact I think this is just a misoptimization exposed by DOM. Rewriting it like: #define FFT 128 typedef struct { signed char exp[FFT]; } feq_t; void test(feq_t *feq) { int k; int feqMinimum = 15; signed char *exp = feq->exp; for (k=0;k<FFT;k++) { signed char temp = exp[k] - feqMinimum; if(temp<-15) temp = -15; if(temp>15) temp = 15; exp[k] = temp; } } --- CUT Also shows the issue even without any jump threading involved (turning off DOM does not fix my example). Please file a bug with both your and my examples. Also what DOM is doing is getting rid of the extra store to exp[k] in some cases. > > Bingfeng Mei > Broadcom UK > > >
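For reference, the shape the loop body has to reach before the vectorizer can do anything is roughly the following hand-written sketch. This is illustrative C only, not the actual GIMPLE GCC produces, and test_branchless is just a made-up name: once the body is a single basic block, the two clamps can be recognized as min/max selects and the loop vectorizes.

  #define FFT 128

  typedef struct { signed char exp[FFT]; } feq_t;

  /* Branchless form of the loop body: no control flow inside the loop,
     so the vectorizer only needs to handle the two clamping selects.  */
  void test_branchless (feq_t *feq)
  {
    int k;
    signed char *exp = feq->exp;
    for (k = 0; k < FFT; k++)
      {
        int temp = exp[k] - 15;
        temp = temp < -15 ? -15 : temp;  /* clamp from below */
        temp = temp > 15 ? 15 : temp;    /* clamp from above */
        exp[k] = temp;
      }
  }

The loop only vectorizes once if-conversion manages to reduce it to this single-block form.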
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 1:57 AM, Jonathan Wakely wrote: > On 21 November 2013 21:17, Peter Bergner wrote: >> On Thu, 2013-11-21 at 16:03 -0500, David Edelsohn wrote: >>> Looks like another issue for the libsanitizer maintainers. >> >> I've been doing bootstraps, but didn't see this because the >> kernel header linux/vt.h use on the RHEL6 system I was doing >> builds on has that field renamed. Looking at our SLES11 >> devel system I do see the problematic header file. > > Yes, it only seems to be a problem with SUSE kernels: > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html As my bugreport is being ignored it would help if one ouf our partners (hint! hint!) would raise this issue via the appropriate channel ;) Richard.
Re: build broken on ppc linux?!
> >>> Looks like another issue for the libsanitizer maintainers. > >> > >> I've been doing bootstraps, but didn't see this because the > >> kernel header linux/vt.h use on the RHEL6 system I was doing > >> builds on has that field renamed. Looking at our SLES11 > >> devel system I do see the problematic header file. > > > > Yes, it only seems to be a problem with SUSE kernels: > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html > > As my bugreport is being ignored it would help if one ouf our > partners (hint! hint!) would raise this issue via the appropriate > channel ;) BTW I do not know if this is related, but my build of GCC is stuck currently with this error message: << /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: Assembler messages: /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: Error: .cfi_endproc without corresponding .cfi_startproc :21485: Error: open CFI at the end of file; missing .cfi_endproc directive make[4]: *** [sanitizer_linux.lo] Error 1 >> Would appreciate a fix/work around! Arno
GCC 4.9.0 Status Report (2013-11-22), Trunk in Stage 3 NOW
Status
======

The trunk is now in Stage 3. To repeat what that means: the trunk is open for general bugfixing, no new features should be added at this point. For exceptions consult your friendly release managers. We have been in Stage 1 for 8 months now. Now is time to look into one of the gazillion regressions we have accumulated. After about two months of general bugfixing the trunk will go into Stage 4 aka "branch state" where only regression and documentation fixes are allowed. When we reach the requirement for a release candidate, which is to end up with zero P1 bugs, the 4.9 branch will be created and Stage 1 will open again. I have re-prioritized regressions, so here is quality data with a delta from the current 4.8 branch (trying out sth new ...).

Quality Data

Priority          #   Change from 4.8 branch status
--------        ---   -----------------------------
P1               63   +  63
P2              136   +-  0
P3               11   -  14
P4               88   +   2
P5               60   +   7
--------        ---   -----------------------------
Total           358   +  58

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2013-10/msg00224.html

The next report will be sent when we leave Stage 3.
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
On Fri, Nov 22, 2013 at 11:03:22AM +, Bingfeng Mei wrote: > Well, in your modified example, it is still due to jump threading that produce > code of bad control flow that cannot be if-converted and vectorized, though in > tree-vrp pass this time. > > Try this > ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps -fno-tree-vrp > > The code can be vectorized. > > Grep "threading" in gcc, it seems that dom and vrp passes are two places that > apply > jump threading. Any other place? I think we need an target hook to control > it. > You can effectively disable jump-threading using: --param max-jump-thread-duplication-stmts=0 (grep dump files for "Jumps threaded") I don't see Andrew's code vectorized even with jump-threading disabled so I think Andrew is correct and this is some other missed optimization. James
Re: build broken on ppc linux?!
> Would appreciate a fix/work around! Configure with --disable-libsanitizer. -- Eric Botcazou
Re: build broken on ppc linux?!
> > Would appreciate a fix/work around! > > Configure with --disable-libsanitizer. Will do, thanks.
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote: > > >>> Looks like another issue for the libsanitizer maintainers. > > >> > > >> I've been doing bootstraps, but didn't see this because the > > >> kernel header linux/vt.h use on the RHEL6 system I was doing > > >> builds on has that field renamed. Looking at our SLES11 > > >> devel system I do see the problematic header file. > > > > > > Yes, it only seems to be a problem with SUSE kernels: > > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html > > > > As my bugreport is being ignored it would help if one ouf our > > partners (hint! hint!) would raise this issue via the appropriate > > channel ;) > > BTW I do not know if this is related, but my build of GCC is stuck > currently with this error message: > > << > /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: > Assembler messages: > /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: > Error: .cfi_endproc without corresponding .cfi_startproc > :21485: Error: open CFI at the end of file; missing .cfi_endproc directive > make[4]: *** [sanitizer_linux.lo] Error 1 > >> > > Would appreciate a fix/work around! I guess something like this could fix this. Though, no idea if clang has any similar macro, or if llvm always uses .cfi_* directives, or what. Certainly for GCC, if __GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them (either as doesn't support them, or gcc simply hasn't been configured to use them, etc.). In that case GCC emits .eh_frame by hand, and it isn't really possible to tweak that. Kostya? --- libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-12 11:31:00.154740857 +0100 +++ libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-22 12:50:50.107420695 +0100 @@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v *%r8 = new_tls, *%r10 = child_tidptr) */ +#ifdef __GCC_HAVE_DWARF2_CFI_ASM ".cfi_endproc\n" +#endif "syscall\n" /* if (%rax != 0) @@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v "jnz1f\n" /* In the child. Terminate unwind chain. */ +#ifdef __GCC_HAVE_DWARF2_CFI_ASM ".cfi_startproc\n" ".cfi_undefined %%rip;\n" +#endif "xorq %%rbp,%%rbp\n" /* Call "fn(arg)". */ Jakub
RE: Jump threading in tree dom pass prevents if-conversion & following vectorization
Yes, it can be vectorized with your suggestion. ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details -ftree-vectorize -save-temps --param max-jump-thread-duplication-stmts=0 See attached assemble file. Bingfeng -Original Message- From: James Greenhalgh [mailto:james.greenha...@arm.com] Sent: 22 November 2013 11:50 To: Bingfeng Mei Cc: Andrew Pinski; gcc@gcc.gnu.org Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization On Fri, Nov 22, 2013 at 11:03:22AM +, Bingfeng Mei wrote: > Well, in your modified example, it is still due to jump threading that produce > code of bad control flow that cannot be if-converted and vectorized, though in > tree-vrp pass this time. > > Try this > ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps -fno-tree-vrp > > The code can be vectorized. > > Grep "threading" in gcc, it seems that dom and vrp passes are two places that > apply > jump threading. Any other place? I think we need an target hook to control > it. > You can effectively disable jump-threading using: --param max-jump-thread-duplication-stmts=0 (grep dump files for "Jumps threaded") I don't see Andrew's code vectorized even with jump-threading disabled so I think Andrew is correct and this is some other missed optimization. James vect-ifconv-2.s Description: vect-ifconv-2.s
Re: build broken on ppc linux?!
> As my bugreport is being ignored it would help if one ouf our Sorry. Which one? > partners (hint! hint!) would raise this issue via the appropriate > channel ;) > > Richard.
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 3:56 PM, Jakub Jelinek wrote: > On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote: >> > >>> Looks like another issue for the libsanitizer maintainers. >> > >> >> > >> I've been doing bootstraps, but didn't see this because the >> > >> kernel header linux/vt.h use on the RHEL6 system I was doing >> > >> builds on has that field renamed. Looking at our SLES11 >> > >> devel system I do see the problematic header file. >> > > >> > > Yes, it only seems to be a problem with SUSE kernels: >> > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html >> > >> > As my bugreport is being ignored it would help if one ouf our >> > partners (hint! hint!) would raise this issue via the appropriate >> > channel ;) >> >> BTW I do not know if this is related, but my build of GCC is stuck >> currently with this error message: >> >> << >> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: >> Assembler messages: >> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: >> Error: .cfi_endproc without corresponding .cfi_startproc >> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive >> make[4]: *** [sanitizer_linux.lo] Error 1 >> >> >> >> Would appreciate a fix/work around! > > I guess something like this could fix this. > Though, no idea if clang has any similar macro, or if llvm always > uses .cfi_* directives, or what. Certainly for GCC, if > __GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them > (either as doesn't support them, or gcc simply hasn't been configured to use > them, etc.). In that case GCC emits .eh_frame by hand, and it isn't really > possible to tweak that. Kostya? > > --- libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-12 > 11:31:00.154740857 +0100 > +++ libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-22 > 12:50:50.107420695 +0100 > @@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v > *%r8 = new_tls, > *%r10 = child_tidptr) > */ > +#ifdef __GCC_HAVE_DWARF2_CFI_ASM > ".cfi_endproc\n" > +#endif > "syscall\n" > > /* if (%rax != 0) > @@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v > "jnz1f\n" > > /* In the child. Terminate unwind chain. */ > +#ifdef __GCC_HAVE_DWARF2_CFI_ASM > ".cfi_startproc\n" > ".cfi_undefined %%rip;\n" > +#endif > "xorq %%rbp,%%rbp\n" > > /* Call "fn(arg)". */ These CFI directives were completely removed in upstream at http://llvm.org/viewvc/llvm-project?rev=192196&view=rev Strangely, this did not get into the last merge... Anyway, these cfi_* will (should, at least) disappear with the next merge which I hope to do in ~ 1 week. (Or anyone is welcome to delete these now as a separate commit, but please make sure the code matches the one in upstream) --kcc > > Jakub
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 4:31 PM, Konstantin Serebryany wrote: > On Fri, Nov 22, 2013 at 3:56 PM, Jakub Jelinek wrote: >> On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote: >>> > >>> Looks like another issue for the libsanitizer maintainers. >>> > >> >>> > >> I've been doing bootstraps, but didn't see this because the >>> > >> kernel header linux/vt.h use on the RHEL6 system I was doing >>> > >> builds on has that field renamed. Looking at our SLES11 >>> > >> devel system I do see the problematic header file. >>> > > >>> > > Yes, it only seems to be a problem with SUSE kernels: >>> > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html >>> > >>> > As my bugreport is being ignored it would help if one ouf our >>> > partners (hint! hint!) would raise this issue via the appropriate >>> > channel ;) >>> >>> BTW I do not know if this is related, but my build of GCC is stuck >>> currently with this error message: >>> >>> << >>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: >>> Assembler messages: >>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: >>> Error: .cfi_endproc without corresponding .cfi_startproc >>> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive >>> make[4]: *** [sanitizer_linux.lo] Error 1 >>> >> >>> >>> Would appreciate a fix/work around! >> >> I guess something like this could fix this. >> Though, no idea if clang has any similar macro, or if llvm always >> uses .cfi_* directives, or what. Certainly for GCC, if >> __GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them >> (either as doesn't support them, or gcc simply hasn't been configured to use >> them, etc.). In that case GCC emits .eh_frame by hand, and it isn't really >> possible to tweak that. Kostya? >> >> --- libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-12 >> 11:31:00.154740857 +0100 >> +++ libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-22 >> 12:50:50.107420695 +0100 >> @@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v >> *%r8 = new_tls, >> *%r10 = child_tidptr) >> */ >> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM >> ".cfi_endproc\n" >> +#endif >> "syscall\n" >> >> /* if (%rax != 0) >> @@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v >> "jnz1f\n" >> >> /* In the child. Terminate unwind chain. */ >> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM >> ".cfi_startproc\n" >> ".cfi_undefined %%rip;\n" >> +#endif >> "xorq %%rbp,%%rbp\n" >> >> /* Call "fn(arg)". */ > > These CFI directives were completely removed in upstream at > http://llvm.org/viewvc/llvm-project?rev=192196&view=rev > Strangely, this did not get into the last merge... Ah, no surprise. The merge was done from llvm's r191666, which is earlier than 192196 > > Anyway, these cfi_* will (should, at least) disappear with the next > merge which I hope to do in ~ 1 week. > (Or anyone is welcome to delete these now as a separate commit, but > please make sure the code matches the one in upstream) > > --kcc > > >> >> Jakub
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote: > > As my bugreport is being ignored it would help if one ouf our > > Sorry. Which one? I believe richi meant https://bugzilla.novell.com/show_bug.cgi?id=849180 Martin > > > partners (hint! hint!) would raise this issue via the appropriate > > channel ;) :-)
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor wrote: > On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote: >> > As my bugreport is being ignored it would help if one ouf our >> >> Sorry. Which one? > > I believe richi meant > https://bugzilla.novell.com/show_bug.cgi?id=849180 I don't have access there.
Re: proposal to make SIZE_TYPE more flexible
On Fri, 22 Nov 2013, DJ Delorie wrote: > If I come up with some table-driven API to register > "integer-like-types" and search/sort/choose from them, would that be a > good starting point? Then we can #define *_type_node to a function > call perhaps. I am doubtful that it's appropriate for e.g. integer_type_node to be a function call. I can believe it makes sense for int128_integer_type_node to be such a call (more precisely, for int128_integer_type_node to cease to exist and for any front-end places needing it to call a function, with a type size that should not be a constant 128). I can also believe it's appropriate for the global nodes for trees reflecting C ABI types to go somewhere other than tree.h. I've no idea whether a table-driven API for anything would be a good starting point. That depends on a detailed analysis of the current situation and its deficiencies for whatever you are proposing replacing with such an API. I *am* reasonably confident that the places handling hardcoded lists of intQI_type_node, intHI_type_node, ... would better iterate over whatever supported integer modes may be present in the particular compiler configuration (and have some set of signed / unsigned / atomic types associated with integer modes) rather than hardcoding a list. It would not surprise me if some of the global type nodes either aren't needed at all or, being only used for built-in functions, should actually be defined in builtin-types.def rather than tree.[ch]. For example, complex_integer_type_node and float_ptr_type_node. But I don't think cleaning up those would actually help in any way towards your goal; it would be a completely orthogonal cleanup. -- Joseph S. Myers jos...@codesourcery.com
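To make that last point concrete, here is a minimal sketch of iterating over the integer modes the target claims to support. It assumes GCC's usual internal headers (coretypes.h, machmode.h, target.h); walk_supported_int_modes and register_int_type are made-up names, while targetm.scalar_mode_supported_p and the mode range are existing machinery.

  /* Sketch only: visit every integer mode the target supports instead of
     hardcoding QImode/HImode/SImode/DImode/TImode.  register_int_type is a
     placeholder for whatever per-mode setup a front end would do.  */
  static void
  walk_supported_int_modes (void)
  {
    enum machine_mode mode;
    for (mode = MIN_MODE_INT; mode <= MAX_MODE_INT;
         mode = (enum machine_mode) (mode + 1))
      if (targetm.scalar_mode_supported_p (mode))
        register_int_type (GET_MODE_BITSIZE (mode));
  }

A real version would presumably also have to cover the partial-integer modes, which is where something like a 20-bit type is likely to live.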
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
On Fri, Nov 22, 2013 at 12:03 PM, Bingfeng Mei wrote: > Well, in your modified example, it is still due to jump threading that produce > code of bad control flow that cannot be if-converted and vectorized, though in > tree-vrp pass this time. > > Try this > ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps -fno-tree-vrp > > The code can be vectorized. > > Grep "threading" in gcc, it seems that dom and vrp passes are two places that > apply > jump threading. Any other place? I think we need an target hook to control it. Surely not. It's just the usual phase ordering issue that cannot be avoided in all cases. Fix if-conversion instead. Richard. > Thanks, > Bingfeng > > -Original Message- > From: Andrew Pinski [mailto:pins...@gmail.com] > Sent: 21 November 2013 21:26 > To: Bingfeng Mei > Cc: gcc@gcc.gnu.org > Subject: Re: Jump threading in tree dom pass prevents if-conversion & > following vectorization > > On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei wrote: >> Hi, >> I am doing some investigation on loops can be vectorized >> by LLVM, but not GCC. One example is loop that contains >> more than one if-else constructs. >> >> typedef signed char int8; >> #define FFT 128 >> >> typedef struct { >> int8 exp[FFT]; >> } feq_t; >> >> void test(feq_t *feq) >> { >> int k; >> int feqMinimum = 15; >> int8 *exp = feq->exp; >> >> for (k=0;k> exp[k] -= feqMinimum; >> if(exp[k]<-15) exp[k] = -15; >> if(exp[k]>15) exp[k] = 15; >> } >> } >> >> Compile it with 4.8.2 on x86_64 >> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details >> -ftree-vectorize -save-temps >> >> It is not vectorized because if-else constructs are not properly >> if-converted. Looking into .ifcvt file, I found the loop is not >> if-converted because of bad if-else structure. One branch jumps directly >> into another branch. Digging a bit deeper, I found such structure >> is generated by dom1 pass doing jump threading optimization. >> So recompile with >> >> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details >> -ftree-vectorize -save-temps -fno-tree-dominator-opts >> >> It is magically if-converted and vectorized! Same on our target, >> performance is improved greatly in this example. >> >> It seems to me that doing jump threading for architectures >> support if-conversion is not a good idea. Original if-else structures >> are damaged so that if-conversion cannot proceed, so are vectorization >> and maybe other optimizations. Should we try to identify those "bad" >> jump threading and skip them for such architectures? > > This is not a bad jump threading at all. In fact I think this is just > a misoptimization exposed by DOM. Rewriting it like: > #define FFT 128 > > typedef struct { > signed char exp[FFT]; > } feq_t; > > void test(feq_t *feq) > { > int k; > int feqMinimum = 15; > signed char *exp = feq->exp; > > for (k=0;k signed char temp = exp[k] - feqMinimum; > if(temp<-15) temp = -15; > if(temp>15) temp = 15; > exp[k] = temp; > } > } > > --- CUT > Also shows the issue even without any jump threading involved (turning > off DOM does not fix my example). Please file a bug with both your > and my examples. > > Also what DOM is doing is getting rid of the extra store to exp[k] in > some cases. > > >> >> Bingfeng Mei >> Broadcom UK >> >> >> >
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 1:36 PM, Konstantin Serebryany wrote: > On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor wrote: >> On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote: >>> > As my bugreport is being ignored it would help if one ouf our >>> >>> Sorry. Which one? >> >> I believe richi meant >> https://bugzilla.novell.com/show_bug.cgi?id=849180 > > I don't have access there. The hint was directed at the IBM people. Richard.
Re: rs6000: load_multiple code
On Fri, Nov 22, 2013 at 09:31:18AM +, Paulo Matos wrote: > > From: Alan Modra [mailto:amo...@gmail.com] > > On Wed, Nov 20, 2013 at 05:06:13PM +, Paulo Matos wrote: > > > I am looking into how rs6000 implements load multiple code > > [snip] > > > > No pseudos are involved for the destination. See the FAIL in > > rs6000.md load_multiple. > > Right, I missed that bit: > if (... > || REGNO (operands[0]) >= 32) > FAIL; > > This will basically never match at expand time then, and will have little, if > any, use before register allocation then. Right? Right. You'll find store_multiple used in function prologues and load_multiple in epilogues, with -Os if the target supports the string insns. movmemsi is of more interest in code elsewhere, and you'll see a comment there about the register allocator. :) -- Alan Modra Australia Development Lab, IBM
Re: build broken on ppc linux?!
Hi, On Fri, Nov 22, 2013 at 04:36:47PM +0400, Konstantin Serebryany wrote: > On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor wrote: > > On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote: > >> > As my bugreport is being ignored it would help if one ouf our > >> > >> Sorry. Which one? > > > > I believe richi meant > > https://bugzilla.novell.com/show_bug.cgi?id=849180 > > I don't have access there. Sorry, although I thought I had checked that the bug was public, apparently I did it wrong and it is not. Anyway, it is a bug against SLES 11, which does not have the kernel patch to make vt.h C++ compilable. Martin
Re: build broken on ppc linux?!
On Fri, 2013-11-22 at 12:30 +0100, Richard Biener wrote: > On Fri, Nov 22, 2013 at 1:57 AM, Jonathan Wakely > wrote: > > Yes, it only seems to be a problem with SUSE kernels: > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html > > As my bugreport is being ignored it would help if one ouf our > partners (hint! hint!) would raise this issue via the appropriate > channel ;) Ok, I'll open a bug on our side and we'll see if that helps move things along. Peter
Re: build broken on ppc linux?!
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: > Assembler messages: > /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: > Error: .cfi_endproc without corresponding .cfi_startproc > :21485: Error: open CFI at the end of file; missing .cfi_endproc directive > make[4]: *** [sanitizer_linux.lo] Error 1 I’ve posted this to the list before, and turns out you need “recent” linux kernel and “recent” binutils to bootstrap GCC these days. But to keep the fun, “recent” is neither documented nor tested at configure time, so you end up with useless error messages. I’ve filed bug reports about it (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59067 and http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068), which have been dutifully ignored. My opinion is that unless the level of support of libsanitizer is increased, it should not be built by default (or build it only if it’s supported). Causing such bootstrap issues would not be tolerated in other parts of the compiler. FX
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 7:00 PM, FX wrote: >> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: >> Assembler messages: >> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: >> Error: .cfi_endproc without corresponding .cfi_startproc >> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive >> make[4]: *** [sanitizer_linux.lo] Error 1 > > I’ve posted this to the list before, and turns out you need “recent” linux > kernel and “recent” binutils to bootstrap GCC these days. But to keep the > fun, “recent” is neither documented nor tested at configure time, so you end > up with useless error messages. > > I’ve filed bug reports about it > (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59067 and > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068), which have been dutifully > ignored. My opinion is that unless the level of support of libsanitizer is > increased, it should not be built by default (or build it only if it’s > supported). Causing such bootstrap issues would not be tolerated in other > parts of the compiler. I am all for disabling libsanitizer if something in the tool chain is old (binutils, kernel, compiler, etc). > > FX
RE: cross compile & exceptions
I did this in order to build gcc, libgcc and libstdc++ independently. When I do the simple integrated build process (following http://gcc.gnu.org/install):

  cd $(GCC_OBJDIR); CFLAGS="-g -O0" $(GCC_SRCDIR)/configure -quiet --prefix=$(INSTALLDIR) --target=$(TARGET) --enable-languages=c,c++,ada --disable-nls --disable-decimal-float --disable-fixed-point --disable-libmudflap --disable-libffi --disable-libssp --disable-shared --disable-threads --without-headers --disable-libada --enable-version-specific-runtime-lib --disable-bootstrap --enable-checking=release
  make -C $(GCC_OBJDIR)

I encounter a problem with libstdc++-v3:

  Configuring in prism/libstdc++-v3
  Configuring in prism/libiberty
  configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
  make[2]: *** [configure-target-libiberty] Error 1
  make[2]: *** Waiting for unfinished jobs
  configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
  make[2]: *** [configure-target-libstdc++-v3] Error 1
  make[1]: *** [all] Error 2
  make: *** [gcc_make] Error 2

because stdio.h is not found (my libc is externally built and --without-headers prevents gcc from knowing where these headers are). I tried --with-headers with my own libc header files (incomplete home-made libc) but this time I got stuck on libgcc2 requiring unistd.h, which I don't have (or want):

  In file included from /vues_statiques/FPGA/belbachir/prism2/MPUCores/tools/gcc-4.5.2/libgcc/../gcc/libgcc2.c:29:0:
  /vues_statiques/FPGA/belbachir/prism2/MPUCores/tools/gcc-4.5.2/libgcc/../gcc/tsystem.h:102:20: fatal error: unistd.h: No such file or directory
  compilation terminated.

So, to build libgcc I would need --without-header to compensate for my small libc, and to build libstdc++ I would have to use --with-header in order to provide stdio.h ... Do you know a better way to solve that than building gcc, libgcc & libstdc++ independently ?
post_inc mem in parallel rtx
Hi,

I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3):

  /* The register should have been invalidated. */
  gcc_assert (REG_VALUES (dreg)->elt == 0);   <<== assert(false)

I investigated the dump and found that the crash occurred during the 207r.dse2 pass. Here is what I saw in the previous dump (206r.pro_and_epilogue):

  (insn 104 47 105 7 (parallel [
              (set (reg:CC_NOOV 56 $CCI)
                  (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
                          (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))
                      (const_int 0 [0])))
              (set (reg:SI 16 $R0 [153])
                  (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
                      (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])))
          ])

Note the post_inc MEM on $R2 appearing twice.

This rtl matches my pattern (predicate and constraint ok) below:

  (define_insn "subsi3_compare0"
    [(set (reg:CC_NOOV CCI_REG)
          (compare:CC_NOOV
            (minus:SI
              (match_operand:SI 1 "general_operand" "g")
              (match_operand:SI 2 "general_operand" "g"))
            (const_int 0)))
     (set (match_operand:SI 0 "register_operand" "=r")
          (minus:SI
            (match_dup 1)
            (match_dup 2)))]

But I think it may be an error to authorize a post_inc MEM in this parallel rtx in operands 1 & 2. When I put a more restrictive constraint which forbids the use of post_inc, the crash in cselib.c disappears.

Question: what does GCC understand when the md describes a pattern allowing the same post_inc MEM in two slots of a parallel rtx? Is it forbidden? Is the MEM address supposed to be incremented twice?

Regards,

Selim
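For what it's worth, the "more restrictive" operand check that keeps auto-increment addresses out of such a pattern can be sketched like this. The predicate name is made up, it assumes GCC's internal rtl.h/recog.h helpers, and a real port would normally wire it up through define_predicate in the .md file rather than as a bare C function.

  /* Sketch of a stricter operand predicate: behave like general_operand,
     but reject MEMs whose address carries an embedded side effect
     (POST_INC, PRE_DEC, POST_MODIFY, ...), so the pattern can never match
     an auto-increment memory operand.  */
  static bool
  nonautoinc_general_operand (rtx op, enum machine_mode mode)
  {
    if (MEM_P (op)
        && GET_RTX_CLASS (GET_CODE (XEXP (op, 0))) == RTX_AUTOINC)
      return false;
    return general_operand (op, mode);
  }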
Re: cross compile & exceptions
On Fri, Nov 22, 2013 at 8:43 AM, BELBACHIR Selim wrote: > I did this in order to build gcc, libgcc and libstdc++ independently. OK, fair enough. Sorry, I don't know what is happening with your original bug report. Ian
Re: cross compile & exceptions
On 11/22/2013 04:43 PM, BELBACHIR Selim wrote: > > So, to build libgcc I would need --without-header to compensate for my small > libc, and to build libstdc++ I would have to use --with-header in order to > provide stdio.h ... > > > Do you know a better way to solve that than building gcc, libgcc & libstdc++ > independently ? What is $(TARGET) ? Andrew.
RE: cross compile & exceptions
>> >> So, to build libgcc I would need --without-header to compensate for my small >> libc, and to build libstdc++ I would have to use --with-header in order to >> provide stdio.h ... >> >> >> Do you know a better way to solve that than building gcc, libgcc & libstdc++ >> independently ? > What is $(TARGET) ? >Andrew. $(TARGET) is a private embedded platform (cpu/os/lib) Selim
Re: post_inc mem in parallel rtx
On 11/22/13 09:43, BELBACHIR Selim wrote: Hi, I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3) /* The register should have been invalidated. */ gcc_assert (REG_VALUES (dreg)->elt == 0);<<== assert(false) I investigated the dump and found that the crash occurred during 207r.dse2 pass. Here is what I saw in the previous dump (206r.pro_and_epilogue) : (insn 104 47 105 7 (parallel [ (set (reg:CC_NOOV 56 $CCI) (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133]) (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])) (const_int 0 [0]))) (set (reg:SI 16 $R0 [153]) (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133]) (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))) ]) Note the post_inc MEM on $R2 appearing twice This rtl match my pattern (predicate and contraint ok) below : (define_insn "subsi3_compare0" [(set (reg:CC_NOOV CCI_REG) (compare:CC_NOOV (minus:SI (match_operand:SI 1 "general_operand" "g") (match_operand:SI 2 " general_operand " " g")) (const_int 0))) (set (match_operand:SI 0 "register_operand" "=r ") (minus:SI (match_dup 1) (match_dup 2)))] But I think It may be an error to authorize post_inc MEM in this parallel rtx in operand 1 & 2. When I put a more restrictive constraint which forbid the use of post_inc, the crash in cselib.c disappear. Question : What does GCC understand when the md describes a pattern allowing the same post_inc MEM in 2 slot of a parallel rtx ? Is it forbidden ? the MEM address is supposed to be incremented twice ? I think the semantics are defined by the PARALLEL. Namely that the uses are evaluated, then side effects are performed. So both sets use the value before incrementing. The only question is what is the resulting value, and given the fundamental nature of PARALLEL, I think a single visible side effect is the most obvious answer. Now having said that, there's a distinct possibility various passes don't handle this properly. jeff
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
On 11/22/13 04:03, Bingfeng Mei wrote: Well, in your modified example, it is still due to jump threading that produce code of bad control flow that cannot be if-converted and vectorized, though in tree-vrp pass this time. Try this ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details -ftree-vectorize -save-temps -fno-tree-vrp The code can be vectorized. Grep "threading" in gcc, it seems that dom and vrp passes are two places that apply jump threading. Any other place? I think we need an target hook to control it. No no. The right thing to do is fix if-conversion. jeff
RE: post_inc mem in parallel rtx
Ok so I should avoid the auto_inc alternatives in PARALLEL. It's certainly a quite rare RTL and I doubt the effort worth it. -Message d'origine- De : Jeff Law [mailto:l...@redhat.com] Envoyé : vendredi 22 novembre 2013 17:55 À : BELBACHIR Selim; gcc@gcc.gnu.org Objet : Re: post_inc mem in parallel rtx On 11/22/13 09:43, BELBACHIR Selim wrote: > Hi, > > I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3) > > /* The register should have been invalidated. */ >gcc_assert (REG_VALUES (dreg)->elt == 0);<<== > assert(false) > > > I investigated the dump and found that the crash occurred during 207r.dse2 > pass. > > Here is what I saw in the previous dump (206r.pro_and_epilogue) : > > (insn 104 47 105 7 (parallel [ > (set (reg:CC_NOOV 56 $CCI) > (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 > ] [133]) > (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 > ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])) > (const_int 0 [0]))) > (set (reg:SI 16 $R0 [153]) > (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133]) > (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 > ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))) > ]) > > Note the post_inc MEM on $R2 appearing twice > > This rtl match my pattern (predicate and contraint ok) below : > > (define_insn "subsi3_compare0" >[(set (reg:CC_NOOV CCI_REG) > (compare:CC_NOOV >(minus:SI > (match_operand:SI 1 "general_operand" "g") > (match_operand:SI 2 " general_operand " " g")) >(const_int 0))) > (set (match_operand:SI 0 "register_operand" "=r ") > (minus:SI >(match_dup 1) >(match_dup 2)))] > > But I think It may be an error to authorize post_inc MEM in this parallel rtx > in operand 1 & 2. > When I put a more restrictive constraint which forbid the use of post_inc, > the crash in cselib.c disappear. > > Question : What does GCC understand when the md describes a pattern allowing > the same post_inc MEM in 2 slot of a parallel rtx ? > Is it forbidden ? the MEM address is supposed to be incremented twice ? I think the semantics are defined by the PARALLEL. Namely that the uses are evaluated, then side effects are performed. So both sets use the value before incrementing. The only question is what is the resulting value, and given the fundamental nature of PARALLEL, I think a single visible side effect is the most obvious answer. Now having said that, there's a distinct possibility various passes don't handle this properly. jeff
Re: post_inc mem in parallel rtx
On 11/22/13 10:03, BELBACHIR Selim wrote: Ok so I should avoid the auto_inc alternatives in PARALLEL. It's certainly a quite rare RTL and I doubt the effort worth it. That'd be my inclination as well. I'm not sure what chip you're working on, but those kind of multiple-output instructions tend to cause all kinds of performance problems once the chip goes to out-of-order execution. Basically most folks designing the chip allow the operations to run independently, but they have to retire as a group. Thus an insn like that would hold 3 slots in the retirement buffer (two outputs plus embedded side effect) until all three operations are ready to retire. That can be a real drag if the memory reference doesn't hit the cache. jeff
RE: Jump threading in tree dom pass prevents if-conversion & following vectorization
So if we are about to fix this in if-conversion, we need to do both in tree & rtl as both ifcvt & ce passes cannot handle it. I am still not convinced jump threading is good for target with predicated execution (assuming no fix for if-conversion). I am doing benchmarking on our target now. Bingfeng -Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: 22 November 2013 16:58 To: Bingfeng Mei; Andrew Pinski Cc: gcc@gcc.gnu.org Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization On 11/22/13 04:03, Bingfeng Mei wrote: > Well, in your modified example, it is still due to jump threading that produce > code of bad control flow that cannot be if-converted and vectorized, though in > tree-vrp pass this time. > > Try this > ~/install-4.8/bin/gcc vect-ifconv-2.c -O2 -fdump-tree-ifcvt-details > -ftree-vectorize -save-temps -fno-tree-vrp > > The code can be vectorized. > > Grep "threading" in gcc, it seems that dom and vrp passes are two places that > apply > jump threading. Any other place? I think we need an target hook to control it. No no. The right thing to do is fix if-conversion. jeff
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
On 11/22/13 10:13, Bingfeng Mei wrote: So if we are about to fix this in if-conversion, we need to do both in tree & rtl as both ifcvt & ce passes cannot handle it. I am still not convinced jump threading is good for target with predicated execution (assuming no fix for if-conversion). I am doing benchmarking on our target now. I'd be quite surprised if your tests show that it's not beneficial. In simplest terms jump threading identifies conditional branches which can have their destination statically determined based on the path taken to the static branch. And more generally, we try *real* hard not to start enabling/disabling tree passes on a per-target basis. The end result if we were to start doing that is an unmaintainable mess. Jeff
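A minimal example of the transformation Jeff describes (illustrative only; the function names are arbitrary):

  /* Classic jump-threading opportunity.  On the path where a > 0, the
     value of x is known to be 1, so the second condition is statically
     determined; the threader duplicates the intervening code and lets
     that path branch directly to foo (), eliminating the second test.  */
  extern void foo (void);
  extern void bar (void);

  void
  example (int a)
  {
    int x;
    if (a > 0)
      x = 1;
    else
      x = 2;
    /* ... code that does not modify x ... */
    if (x == 1)
      foo ();
    else
      bar ();
  }

The cost is that the duplicated blocks leave later passes with a messier CFG, which is exactly the tension being discussed in this thread.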
Re: cross compile & exceptions
On 11/22/2013 04:54 PM, BELBACHIR Selim wrote: >>> >>> So, to build libgcc I would need --without-header to compensate for my >>> small libc, and to build libstdc++ I would have to use --with-header in >>> order to provide stdio.h ... >>> >>> >>> Do you know a better way to solve that than building gcc, libgcc & >>> libstdc++ independently ? > >> What is $(TARGET) ? > >> Andrew. > > $(TARGET) is a private embedded platform (cpu/os/lib) Right, but GCC is trying to build against unistd.h. It's not going to do that unless you tell it you have a UNIX-like target. I'd start by building GCC against newlib. Andrew.
Re: build broken on ppc linux?!
On Nov 22, 2013, at 4:31 AM, Konstantin Serebryany wrote:
> These CFI directives were completely removed in upstream at
> http://llvm.org/viewvc/llvm-project?rev=192196&view=rev
> Strangely, this did not get into the last merge...
>
> Anyway, these cfi_* will (should, at least) disappear with the next
> merge which I hope to do in ~ 1 week.
> (Or anyone is welcome to delete these now as a separate commit, but
> please make sure the code matches the one in upstream)

This is exactly the patch referenced in the pointer to the upstream repo. Arno, does this fix the build for you?

Ok?

Index: libsanitizer/sanitizer_common/sanitizer_linux.cc
===================================================================
--- libsanitizer/sanitizer_common/sanitizer_linux.cc	(revision 205278)
+++ libsanitizer/sanitizer_common/sanitizer_linux.cc	(working copy)
@@ -785,7 +785,6 @@ uptr internal_clone(int (*fn)(void *), v
                        *%r8 = new_tls,
                        *%r10 = child_tidptr)
                      */
-                     ".cfi_endproc\n"
                      "syscall\n"

                      /* if (%rax != 0)
@@ -795,8 +794,9 @@ uptr internal_clone(int (*fn)(void *), v
                      "jnz    1f\n"

                      /* In the child. Terminate unwind chain. */
-                     ".cfi_startproc\n"
-                     ".cfi_undefined %%rip;\n"
+                     // XXX: We should also terminate the CFI unwind chain
+                     // here. Unfortunately clang 3.2 doesn't support the
+                     // necessary CFI directives, so we skip that part.
                      "xorq %%rbp,%%rbp\n"

                      /* Call "fn(arg)". */
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 10:11:18AM -0800, Mike Stump wrote: > On Nov 22, 2013, at 4:31 AM, Konstantin Serebryany > wrote: > > These CFI directives were completely removed in upstream at > > http://llvm.org/viewvc/llvm-project?rev=192196&view=rev > > Strangely, this did not get into the last merge... > > > > Anyway, these cfi_* will (should, at least) disappear with the next > > merge which I hope to do in ~ 1 week. > > (Or anyone is welcome to delete these now as a separate commit, but > > please make sure the code matches the one in upstream) > > This is exactly the patch referenced in the pointer to the upstream repo. > Arno, does this fix the build for you? > > Ok? Yes (though, I really wonder why it needs to be removed rather than only conditionally added based on preprocessor macros, but that is a question for upstream). Jakub
Re: build broken on ppc linux?!
> This is exactly the patch referenced in the pointer to the upstream repo. > Arno, does this fix the build for you? Well now I encounter: /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: In function '__sanitizer::uptr __sanitizer::internal_filesize(__sanitizer::fd_t)': /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:176:19: warning: 'st.stat::st_size' may be used uninitialized in this function [-Wmaybe-uninitialized] return (uptr)st.st_size; ^ So I guess that's what we call "progress". I'll keep using --disable-libsanitizer for the time being, this library is clearly not quite productized yet IMO. Arno
RE: Jump threading in tree dom pass prevents if-conversion & following vectorization
I understand what jump threading does. In theory it reduces number of instructions executed. But it creates messy program structure and prevents further optimizations, at least for target we have (VLIW-based DSP with predicated execution). I just ran through 8 audio codecs we use as internal benchmark. 5 out of 8 codecs have similar performance with/without jump threading (give or take 0.1-0.2%). For the other 3, no jump threading version outperforms by 1-2.5%. I didn't even enable -ftree-vectorize. I am going to do some further investigation and check whether if-conversion can be fixed without disabling jump threading. Bingfeng -Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: 22 November 2013 17:17 To: Bingfeng Mei; Andrew Pinski; Richard Biener Cc: gcc@gcc.gnu.org Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization On 11/22/13 10:13, Bingfeng Mei wrote: > So if we are about to fix this in if-conversion, we need to do both in tree & > rtl as both ifcvt & ce passes cannot handle it. > > I am still not convinced jump threading is good for target with predicated > execution (assuming no fix for if-conversion). I am doing benchmarking on our > target now. I'd be quite surprised if your tests show that it's not beneficial. In simplest terms jump threading identifies conditional branches which can have their destination statically determined based on the path taken to the static branch. And more generally, we try *real* hard not to start enabling/disabling tree passes on a per-target basis. The end result if we were to start doing that is an unmaintainable mess. Jeff
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
Hey, What is jump threading? I've not heard of it before ( http://en.wikipedia.org/wiki/Jump_threading is basically the description of the compiler flag ) Alec On 22/11/13 19:06, Bingfeng Mei wrote: I understand what jump threading does. In theory it reduces number of instructions executed. But it creates messy program structure and prevents further optimizations, at least for target we have (VLIW-based DSP with predicated execution). I just ran through 8 audio codecs we use as internal benchmark. 5 out of 8 codecs have similar performance with/without jump threading (give or take 0.1-0.2%). For the other 3, no jump threading version outperforms by 1-2.5%. I didn't even enable -ftree-vectorize. I am going to do some further investigation and check whether if-conversion can be fixed without disabling jump threading. Bingfeng -Original Message- From: Jeff Law [mailto:l...@redhat.com] Sent: 22 November 2013 17:17 To: Bingfeng Mei; Andrew Pinski; Richard Biener Cc: gcc@gcc.gnu.org Subject: Re: Jump threading in tree dom pass prevents if-conversion & following vectorization On 11/22/13 10:13, Bingfeng Mei wrote: So if we are about to fix this in if-conversion, we need to do both in tree & rtl as both ifcvt & ce passes cannot handle it. I am still not convinced jump threading is good for target with predicated execution (assuming no fix for if-conversion). I am doing benchmarking on our target now. I'd be quite surprised if your tests show that it's not beneficial. In simplest terms jump threading identifies conditional branches which can have their destination statically determined based on the path taken to the static branch. And more generally, we try *real* hard not to start enabling/disabling tree passes on a per-target basis. The end result if we were to start doing that is an unmaintainable mess. Jeff
Re: proposal to make SIZE_TYPE more flexible
> (more precisely, for int128_integer_type_node to cease to exist and > for any front-end places needing it to call a function, with a type > size that should not be a constant 128). The complications I've seen there is, for example, when you're iterating through types looking for a "best" type, where some of the types are fixed foo_type_node's and others are dynamic intN_type_nodes. Perhaps we sould use a hybrid list-plus-table approach? So we check for the standard types explicitly, then iterate through the list of intN types? > I can also believe it's appropriate for the global nodes for trees > reflecting C ABI types to go somewhere other than tree.h. Which are those? Why isn't "int" one of those? > I've no idea whether a table-driven API for anything would be a good > starting point. That depends on a detailed analysis of the current > situation and its deficiencies for whatever you are proposing replacing > with such an API. If you want to support more than one intN at a time, IMHO you need more than just one intN object, hence a table of some sort. Or are you assuming that any given backend would only be allowed to define one intN type? That's already not going to work, as I need int20_t in addition to the "now standard" int128_t. > I *am* reasonably confident that the places handling hardcoded lists > of intQI_type_node, intHI_type_node, ... would better iterate over > whatever supported integer modes may be present in the particular > compiler configuration (and have some set of signed / unsigned / > atomic types associated with integer modes) rather than hardcoding a > list. How is this different than the places handling hardcoded lists of integer_type_node et al? > It would not surprise me if some of the global type nodes either aren't > needed at all or, being only used for built-in functions, should actually > be defined in builtin-types.def rather than tree.[ch]. For example, > complex_integer_type_node and float_ptr_type_node. But I don't think > cleaning up those would actually help in any way towards your goal; it > would be a completely orthogonal cleanup. Yeah, I'm not trying to take on more work, just trying to hit the prereqs for my own project.
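Purely as a sketch of what such a registration table could look like (none of these names exist in GCC; the only assumption is the usual tree typedef from tree.h):

  /* Hypothetical registry for target-defined intN types.  A target would
     register the bit sizes it wants (e.g. 20 and 128), and the front end
     would fill in the signed/unsigned type nodes at initialization time.  */
  struct intn_type_entry
  {
    unsigned int bitsize;   /* e.g. 20 for __int20, 128 for __int128 */
    tree signed_type;
    tree unsigned_type;
  };

  #define MAX_INTN_TYPES 8
  extern struct intn_type_entry intn_types[MAX_INTN_TYPES];
  extern unsigned int num_intn_types;

The standard C type nodes (integer_type_node and friends) would stay where they are; only the target-specific intN types would be looked up through such a table.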
Re: Jump threading in tree dom pass prevents if-conversion & following vectorization
On Fri, Nov 22, 2013 at 6:16 PM, Jeff Law wrote: >> I am still not convinced jump threading is good for target with predicated >> execution (assuming no fix for if-conversion). I am doing benchmarking on >> our target now. Try disabling only jump threading of back edges, loop latches, and jump threading in small loops. Any "jump forwarding" is almost always a win. > I'd be quite surprised if your tests show that it's not beneficial. > > In simplest terms jump threading identifies conditional branches which can > have their destination statically determined based on the path taken to the > static branch. Still, optimizing away such conditional branches is not automatically a win. There have always been issues with tree-ssa DOM doing jump-threading so aggressively that other passes couldn't handle the resulting control flow anymore, especially jump threading around/near loops. Ciao! Steven
Re: build broken on ppc linux?!
On Fri, Nov 22, 2013 at 07:21:07PM +0100, Arnaud Charlet wrote: > > This is exactly the patch referenced in the pointer to the upstream repo. > > Arno, does this fix the build for you? > > Well now I encounter: > > /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: In > function '__sanitizer::uptr > __sanitizer::internal_filesize(__sanitizer::fd_t)': > /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:176:19: > warning: 'st.stat::st_size' may be used uninitialized in this function > [-Wmaybe-uninitialized] >return (uptr)st.st_size; >^ > > So I guess that's what we call "progress". > > I'll keep using --disable-libsanitizer for the time being, this library is > clearly not quite productized yet IMO. Here is a patch to fix various warnings, the remaining ones I'm seeing are mostly that libsanitizer uses incorrectly C90/C++98 ... in macros (the standard require it to be non-empty), either use the GNU extension instead, #define INTERCEPTOR(a, b, c...) and ,## c if needed to get rid of the preceeding comma if empty (though, you compile with -pedantic, so might get warnings about that too), or rework the macros or have different ones for the zero argument cases (INTERCEPTOR0). There are some additional warnings caused by the #ifdef SYSCALL_INTERCEPTION hacks we have to avoid various issues with problematic kernel headers or libsanitizer code not having non-i?86/x86_64 in mind. The sanitizer_syscall_linux_x86_64.inc changes fix real bugs, the rest is just to get the noise level down. --- sanitizer_common/sanitizer_linux.cc.jj 2013-11-12 11:31:00.0 +0100 +++ sanitizer_common/sanitizer_linux.cc 2013-11-22 20:15:26.652376137 +0100 @@ -216,7 +216,7 @@ uptr GetTid() { } u64 NanoTime() { - kernel_timeval tv = {}; + kernel_timeval tv = {0, 0}; internal_syscall(__NR_gettimeofday, (uptr)&tv, 0); return (u64)tv.tv_sec * 1000*1000*1000 + tv.tv_usec * 1000; } --- sanitizer_common/sanitizer_syscall_linux_x86_64.inc.jj 2013-11-12 11:31:00.0 +0100 +++ sanitizer_common/sanitizer_syscall_linux_x86_64.inc 2013-11-22 20:14:32.752657581 +0100 @@ -11,7 +11,7 @@ static uptr internal_syscall(u64 nr) { u64 retval; - asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11"); + asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11", "memory"); return retval; } @@ -19,7 +19,7 @@ template static uptr internal_syscall(u64 nr, T1 arg1) { u64 retval; asm volatile("syscall" : "=a"(retval) : "a"(nr), "D"((u64)arg1) : - "rcx", "r11"); + "rcx", "r11", "memory"); return retval; } @@ -27,7 +27,7 @@ template static uptr internal_syscall(u64 nr, T1 arg1, T2 arg2) { u64 retval; asm volatile("syscall" : "=a"(retval) : "a"(nr), "D"((u64)arg1), - "S"((u64)arg2) : "rcx", "r11"); + "S"((u64)arg2) : "rcx", "r11", "memory"); return retval; } @@ -35,7 +35,7 @@ template
Re: build broken on ppc linux?!
On Nov 22, 2013, at 10:13 AM, Jakub Jelinek wrote: >> This is exactly the patch referenced in the pointer to the upstream repo. >> Arno, does this fix the build for you? >> >> Ok? > > Yes Committed revision 205285.
Re: proposal to make SIZE_TYPE more flexible
On Fri, 22 Nov 2013, DJ Delorie wrote: > > (more precisely, for int128_integer_type_node to cease to exist and > > for any front-end places needing it to call a function, with a type > > size that should not be a constant 128). > > The complications I've seen there is, for example, when you're > iterating through types looking for a "best" type, where some of the > types are fixed foo_type_node's and others are dynamic > intN_type_nodes. Perhaps we sould use a hybrid list-plus-table > approach? So we check for the standard types explicitly, then iterate > through the list of intN types? In general you need to analyze each such case individually to produce a reasoned argument for what it should logically be doing. Given such analyses, maybe then you can identify particular tables of types in particular orders (for example) that should be set up to iterate through. > > I can also believe it's appropriate for the global nodes for trees > > reflecting C ABI types to go somewhere other than tree.h. > > Which are those? Why isn't "int" one of those? I think "int" is one of them. Those files that have a need for C ABI types would include tree-c-abi.h. Optimizers that aren't e.g. generating calls to built-in functions where the C ABI is involved wouldn't include that header. As this should be orthogonal to your project it could just as well be part of the tree.h cleanup project. > > I've no idea whether a table-driven API for anything would be a good > > starting point. That depends on a detailed analysis of the current > > situation and its deficiencies for whatever you are proposing replacing > > with such an API. > > If you want to support more than one intN at a time, IMHO you need > more than just one intN object, hence a table of some sort. > > Or are you assuming that any given backend would only be allowed to > define one intN type? That's already not going to work, as I need > int20_t in addition to the "now standard" int128_t. I am saying that the starting point is understanding what is logically correct in the various different places dealing with integer types, and an analysis of that is what must drive any API design. Does the target with __int20 actually have __int128 (i.e. pass targetm.scalar_mode_supported_p (TImode))? But you should indeed be able to have an arbitrary number of such types. > > I *am* reasonably confident that the places handling hardcoded lists > > of intQI_type_node, intHI_type_node, ... would better iterate over > > whatever supported integer modes may be present in the particular > > compiler configuration (and have some set of signed / unsigned / > > atomic types associated with integer modes) rather than hardcoding a > > list. > > How is this different than the places handling hardcoded lists of > integer_type_node et al? (a) We already have the system for an arbitrary set of integer modes to be defined and iterated over, whereas the set of standard C types is target-independent. (b) This sort of thing tends to be more readily addressed through a series of small cleanup patches that clearly isolate anything that might possibly change behavior at all than through one huge patch. So any small obvious things are naturally separated out. (c) Iteration over C types has other complications such as preferences between different types (e.g. int and long) with the same middle-end properties. (d) I don't think the standard C types are particularly relevant to your project. -- Joseph S. Myers jos...@codesourcery.com
Re: proposal to make SIZE_TYPE more flexible
> In general you need to analyze each such case individually to produce a
> reasoned argument for what it should logically be doing.  Given such
> analyses, maybe then you can identify particular tables of types in
> particular orders (for example) that should be set up to iterate through.

Ok, I'll do this next, except it's what I did first... and we still
haven't decided how to handle some of those cases.  I'll re-analyze.

> Does the target with __int20 actually have __int128 (i.e. pass
> targetm.scalar_mode_supported_p (TImode))?  But you should indeed be able
> to have an arbitrary number of such types.

It doesn't support it, but it does *have* it, in that the compiler
correctly parses the __int128 keyword and knows to tell you it isn't
supported.  So, it needs at least two keywords, which implies "it needs
two..." in other places.

And it's reasonable to expect that *someone* will want int16, int32,
etc. types once a general solution is in place.

> (d) I don't think the standard C types are particularly relevant to your
> project.

Should we be pulling the int128 support out of the integer_types[] list
and putting it in the global_trees[] (or some new table) list?  Because
most of the int128 support is tied in with the standard C type handling,
not the target-specific handling.
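[A small user-level illustration of the "parses the keyword but doesn't
support the type" distinction; this assumes, as a hedge, that
__SIZEOF_INT128__ is predefined only when __int128 is actually usable, so
the code still builds where the keyword is merely recognized and rejected.]

#include <stdio.h>

/* Fall back to long long where __int128 is not actually supported,
   even if the front end knows the keyword.  */
#ifdef __SIZEOF_INT128__
typedef unsigned __int128 wide_t;
#else
typedef unsigned long long wide_t;
#endif

int main (void)
{
  wide_t x = (wide_t) 1 << 40;   /* fits in either typedef */
  printf ("wide_t is %d bytes\n", (int) sizeof x);
  return 0;
}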
Question about CFLAGS/CXXFLAGS when building GCC
I am building a cross GCC (targeting MIPS) on an x86-64 Linux system, but I
want to build the compiler as a 32-bit executable.  I thought the right way
to do this was to do:

export CFLAGS='-O2 -g -m32'
export CXXFLAGS='-O2 -g -m32'

before running configure and make.

This is working in that it created cc1 as a 32-bit executable like I wanted
it to, but when the build continues and builds libgcc, it uses CFLAGS when
it is using the newly built gcc to compile libgcc.  That is wrong because
the GCC compiler that I just built (targeting MIPS) does not understand the
-m32 flag, and I don't want to override the options used when building the
libraries anyway, only the options used to build the host executables.

Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?

Steve Ellcey
sell...@mips.com
Re: Question about CFLAGS/CXXFLAGS when building GCC
On Fri, Nov 22, 2013 at 1:24 PM, Steve Ellcey wrote:
>
> I am building a cross GCC (targeting MIPS) on an x86-64 Linux system, but I
> want to build the compiler as a 32-bit executable.  I thought the right way
> to do this was to do:
>
> export CFLAGS='-O2 -g -m32'
> export CXXFLAGS='-O2 -g -m32'
>
> before running configure and make.
>
> This is working in that it created cc1 as a 32-bit executable like I wanted
> it to, but when the build continues and builds libgcc, it uses CFLAGS when
> it is using the newly built gcc to compile libgcc.  That is wrong because
> the GCC compiler that I just built (targeting MIPS) does not understand the
> -m32 flag, and I don't want to override the options used when building the
> libraries anyway, only the options used to build the host executables.
>
> Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?

Can you leave CFLAGS/CXXFLAGS untouched?  Instead, do:

# CC="gcc -m32" CXX="g++ -m32" .../configure
# make CC="gcc -m32" CXX="g++ -m32" ...

-- 
H.J.
Re: Question about CFLAGS/CXXFLAGS when building GCC
> I am building a cross GCC (targeting MIPS) on an x86-64 Linux system, but I
> want to build the compiler as a 32-bit executable.  I thought the right way
> to do this was to do:
>
> export CFLAGS='-O2 -g -m32'
> export CXXFLAGS='-O2 -g -m32'
>
> before running configure and make.
>
> This is working in that it created cc1 as a 32-bit executable like I wanted
> it to, but when the build continues and builds libgcc, it uses CFLAGS when
> it is using the newly built gcc to compile libgcc.

The usual way to do this is to set CC and CXX at the configure stage:

CC="gcc -m32" CXX="g++ -m32" $(srcdir)/configure ...

-- 
Eric Botcazou
Re: Question about CFLAGS/CXXFLAGS when building GCC
On Fri, 2013-11-22 at 13:48 -0800, H.J. Lu wrote:
> On Fri, Nov 22, 2013 at 1:24 PM, Steve Ellcey wrote:
> >
> > I am building a cross GCC (targeting MIPS) on an x86-64 Linux system, but I
> > want to build the compiler as a 32-bit executable.  I thought the right way
> > to do this was to do:
> >
> > export CFLAGS='-O2 -g -m32'
> > export CXXFLAGS='-O2 -g -m32'
> >
> > before running configure and make.
> >
> > This is working in that it created cc1 as a 32-bit executable like I wanted
> > it to, but when the build continues and builds libgcc, it uses CFLAGS when
> > it is using the newly built gcc to compile libgcc.  That is wrong because
> > the GCC compiler that I just built (targeting MIPS) does not understand the
> > -m32 flag, and I don't want to override the options used when building the
> > libraries anyway, only the options used to build the host executables.
> >
> > Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?
>
> Can you leave CFLAGS/CXXFLAGS untouched?  Instead, do:
>
> # CC="gcc -m32" CXX="g++ -m32" .../configure
> # make CC="gcc -m32" CXX="g++ -m32" ...

Doh.  I don't know why that didn't occur to me.  It should work and that
is what I will do.

Steve
Re: proposal to make SIZE_TYPE more flexible
On Fri, 22 Nov 2013, DJ Delorie wrote:

> > Does the target with __int20 actually have __int128 (i.e. pass
> > targetm.scalar_mode_supported_p (TImode))?  But you should indeed be able
> > to have an arbitrary number of such types.
>
> It doesn't support it, but it does *have* it, in that the compiler
> correctly parses the __int128 keyword and knows to tell you it isn't
> supported.  So, it needs at least two keywords, which implies "it needs
> two..." in other places.

Making __int20 and __int128 exactly alike would mean __int128 *not* being
a keyword on targets that do not support it.

> And it's reasonable to expect that *someone* will want int16, int32,
> etc. types once a general solution is in place.

As previously noted, it's best only to define such types where (a) there
is an integer mode passing targetm.scalar_mode_supported_p and (b) no
standard C type matches, to avoid issues with whether __int32 is the same
as int or not.

> > (d) I don't think the standard C types are particularly relevant to your
> > project.
>
> Should we be pulling the int128 support out of the integer_types[] list
> and putting it in the global_trees[] (or some new table) list?  Because
> most of the int128 support is tied in with the standard C type handling,
> not the target-specific handling.

My guess is that the int128 support doesn't belong in any of the existing
global arrays, but in some new arrays supporting whatever set of intN
types the target has.  That's just a guess; whether you follow it or not,
your analysis of the code needs to justify your choice.

-- 
Joseph S. Myers
jos...@codesourcery.com
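[A throwaway model of rule (b) above, i.e. only define an __intN when no
standard C type already has that width, written as ordinary host C purely
for illustration; the helper and the 20/32-bit queries are invented for the
example and are not GCC code.]

#include <limits.h>
#include <stdbool.h>
#include <stdio.h>

/* Widths of the standard C integer types on this host (ignoring the
   possibility of padding bits, which is fine for a sketch).  */
static bool matches_standard_type (int bits)
{
  const int std_bits[] = {
    CHAR_BIT,
    CHAR_BIT * (int) sizeof (short),
    CHAR_BIT * (int) sizeof (int),
    CHAR_BIT * (int) sizeof (long),
    CHAR_BIT * (int) sizeof (long long)
  };
  for (unsigned i = 0; i < sizeof std_bits / sizeof std_bits[0]; i++)
    if (std_bits[i] == bits)
      return true;
  return false;
}

int main (void)
{
  /* __int20 would qualify (no standard 20-bit type); __int32 would not
     on a host where int is already 32 bits wide.  */
  printf ("define __int20? %s\n", matches_standard_type (20) ? "no" : "yes");
  printf ("define __int32? %s\n", matches_standard_type (32) ? "no" : "yes");
  return 0;
}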