RE: rs6000: load_multiple code

2013-11-22 Thread Paulo Matos


Paulo Matos


> -----Original Message-----
> From: Alan Modra [mailto:amo...@gmail.com]
> Sent: 22 November 2013 04:42
> To: Paulo Matos
> Cc: gcc@gcc.gnu.org
> Subject: Re: rs6000: load_multiple code
> 
> On Wed, Nov 20, 2013 at 05:06:13PM +, Paulo Matos wrote:
> > I am looking into how rs6000 implements load multiple code
> [snip]
> 
> No pseudos are involved for the destination.  See the FAIL in
> rs6000.md load_multiple.

Right, I missed that bit:
if (...
|| REGNO (operands[0]) >= 32)
  FAIL;

So this will basically never match at expand time, and will be of little, if 
any, use before register allocation. Right?
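
To make the effect of that check concrete, here is a minimal C sketch. It models only the part of the condition quoted above (the elided part is not reproduced). It assumes, as is usual in GCC, that pseudo registers are numbered above all hard registers, so a pseudo destination always has a number of 32 or more and the expander FAILs. The names `SKETCH_FIRST_PSEUDO`, `dest_passes_quoted_check` and `is_pseudo` are hypothetical and do not come from rs6000.md; the value 100 is an arbitrary placeholder.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the first register number handed out to
   pseudos.  The real value is target-specific; any value above 31
   gives the same result in this sketch. */
enum { SKETCH_FIRST_PSEUDO = 100 };

static_assert(SKETCH_FIRST_PSEUDO > 31,
              "pseudos are assumed to be numbered above the 32 GPRs");

/* Models only the quoted test: destinations numbered 32 or above are
   rejected, i.e. the expander would FAIL. */
bool dest_passes_quoted_check(unsigned regno)
{
    return regno < 32;
}

/* True when regno denotes a pseudo register in this sketch. */
bool is_pseudo(unsigned regno)
{
    return regno >= SKETCH_FIRST_PSEUDO;
}
```

Under these assumptions `dest_passes_quoted_check` is false for every register number for which `is_pseudo` is true, which is why the pattern cannot match while the destination is still a pseudo.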

> 
> --
> Alan Modra
> Australia Development Lab, IBM



Re: [doc] Fixing reference inside Extended-Asm.html

2013-11-22 Thread dw



Is the version of texinfo buggy to generate online documentation?


Sorry for the delayed response.  I was hoping the gcc expert on docs 
would respond so I could see who that was.


I have been doing some work on Extended-Asm.html (see the work in 
progress at 
http://www.limegreensocks.com/gcc/Using-Assembly-Language-with-C.html) 
and I haven't had a problem generating output.


dw


RE: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Bingfeng Mei
Well, in your modified example, it is still jump threading that produces the
bad control flow that cannot be if-converted and vectorized, though this time
it happens in the tree-vrp pass.

Try this 
~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
-ftree-vectorize  -save-temps -fno-tree-vrp

The code can be vectorized. 

Grepping for "threading" in gcc, it seems that the dom and vrp passes are the
two places that apply jump threading. Is there any other place? I think we
need a target hook to control it.

Thanks,
Bingfeng

-----Original Message-----
From: Andrew Pinski [mailto:pins...@gmail.com] 
Sent: 21 November 2013 21:26
To: Bingfeng Mei
Cc: gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following 
vectorization

On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei  wrote:
> Hi,
> I am doing some investigation on loops can be vectorized
> by LLVM, but not GCC. One example is loop that contains
> more than one if-else constructs.
>
> typedef signed char int8;
> #define FFT 128
>
> typedef struct {
> int8   exp[FFT];
> } feq_t;
>
> void test(feq_t *feq)
> {
> int k;
> int feqMinimum = 15;
> int8 *exp = feq->exp;
>
> for (k=0;k<FFT;k++) {
> exp[k] -= feqMinimum;
> if(exp[k]<-15) exp[k] = -15;
> if(exp[k]>15) exp[k]  = 15;
> }
> }
>
> Compile it with 4.8.2 on x86_64
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps
>
> It is not vectorized because if-else constructs are not properly
> if-converted. Looking into .ifcvt file, I found the loop is not
> if-converted because of bad if-else structure. One branch jumps directly
> into another branch. Digging a bit deeper, I found such structure
> is generated by dom1 pass doing jump threading optimization.
> So recompile with
>
> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-dominator-opts
>
> It is magically if-converted and vectorized! Same on our target,
> performance is improved greatly in this example.
>
> It seems to me that doing jump threading for architectures
> support if-conversion is not a good idea. Original if-else structures
> are damaged so that if-conversion cannot proceed, so are vectorization
> and maybe other optimizations. Should we try to identify those "bad"
> jump threading and skip them for such architectures?

This is not a bad jump threading at all.  In fact I think this is just
a misoptimization exposed by DOM.  Rewriting it like:
#define FFT 128

typedef struct {
signed char   exp[FFT];
} feq_t;

void test(feq_t *feq)
{
int k;
int feqMinimum = 15;
signed char *exp = feq->exp;

for (k=0;k<FFT;k++) {
signed char temp = exp[k] - feqMinimum;
if(temp<-15) temp = -15;
if(temp>15) temp  = 15;
exp[k] = temp;
}
}

--- CUT 
Also shows the issue even without any jump threading involved (turning
off DOM does not fix my example).  Please file a bug with both your
and my examples.

Also what DOM is doing is getting rid of the extra store to exp[k] in
some cases.
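
For readers unfamiliar with if-conversion, the following sketch shows, in plain C, roughly the branch-free shape the loop has to reach before the vectorizer can handle it: each conditional assignment becomes a select on a temporary, followed by a single unconditional store. This is a hand-written illustration of the idea, not output from GCC, and `test_selects` is a hypothetical name.

```c
#include <assert.h>

#define FFT 128

typedef struct {
    signed char exp[FFT];
} feq_t;

/* Same computation as the examples above, written with selects
   instead of conditional stores. */
void test_selects(feq_t *feq)
{
    const int feqMinimum = 15;
    signed char *exp;

    assert(feq != 0);
    exp = feq->exp;
    for (int k = 0; k < FFT; k++) {
        signed char temp = (signed char)(exp[k] - feqMinimum);
        temp = (temp < -15) ? -15 : temp;   /* select, no branch needed */
        temp = (temp > 15) ? 15 : temp;
        exp[k] = temp;                      /* one store per iteration */
    }
}
```

Written this way there is exactly one store to exp[k] per iteration, which is the property the thread is circling around.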


>
> Bingfeng Mei
> Broadcom UK
>
>
>



Re: build broken on ppc linux?!

2013-11-22 Thread Richard Biener
On Fri, Nov 22, 2013 at 1:57 AM, Jonathan Wakely  wrote:
> On 21 November 2013 21:17, Peter Bergner wrote:
>> On Thu, 2013-11-21 at 16:03 -0500, David Edelsohn wrote:
>>> Looks like another issue for the libsanitizer maintainers.
>>
>> I've been doing bootstraps, but didn't see this because the
>> kernel header linux/vt.h use on the RHEL6 system I was doing
>> builds on has that field renamed.  Looking at our SLES11
>> devel system I do see the problematic header file.
>
> Yes, it only seems to be a problem with SUSE kernels:
> http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html

As my bugreport is being ignored it would help if one of our
partners (hint! hint!) would raise this issue via the appropriate
channel ;)

Richard.


Re: build broken on ppc linux?!

2013-11-22 Thread Arnaud Charlet
> >>> Looks like another issue for the libsanitizer maintainers.
> >>
> >> I've been doing bootstraps, but didn't see this because the
> >> kernel header linux/vt.h use on the RHEL6 system I was doing
> >> builds on has that field renamed.  Looking at our SLES11
> >> devel system I do see the problematic header file.
> >
> > Yes, it only seems to be a problem with SUSE kernels:
> > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html
> 
> As my bugreport is being ignored it would help if one of our
> partners (hint! hint!) would raise this issue via the appropriate
> channel ;)

BTW I do not know if this is related, but my build of GCC is stuck
currently with this error message:

<<
/users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: Assembler messages:
/users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821: Error: .cfi_endproc without corresponding .cfi_startproc
:21485: Error: open CFI at the end of file; missing .cfi_endproc directive
make[4]: *** [sanitizer_linux.lo] Error 1
>>

Would appreciate a fix/work around!

Arno


GCC 4.9.0 Status Report (2013-11-22), Trunk in Stage 3 NOW

2013-11-22 Thread Richard Biener

Status
======

The trunk is now in Stage 3.  To repeat what that means: the trunk
is open for general bugfixing, no new features should be added
at this point.  For exceptions consult your friendly release
managers.

We have been in Stage 1 for 8 months now.  Now is the time to look 
into one of the gazillion regressions we have accumulated.
After about two months of general bugfixing the trunk will go
into Stage 4 aka "branch state" where only regression and
documentation fixes are allowed.  When we reach the requirement
for a release candidate, which is to end up with zero P1 bugs,
the 4.9 branch will be created and Stage 1 will open again.

I have re-prioritized regressions, so here is quality data
with a delta from the current 4.8 branch (trying out something new ...).


Quality Data
============

Priority          #   Change from 4.8 branch status
--------        ---   -----------------------------
P1               63   + 63
P2              136   +-  0
P3               11   - 14
P4               88   +  2
P5               60   +  7
--------        ---   -----------------------------
Total           358   + 58



Previous Report
===============

http://gcc.gnu.org/ml/gcc/2013-10/msg00224.html


The next report will be sent when we leave Stage 3.


Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread James Greenhalgh
On Fri, Nov 22, 2013 at 11:03:22AM +, Bingfeng Mei wrote:
> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time. 
> 
> Try this 
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-vrp
> 
> The code can be vectorized. 
> 
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that 
> apply
> jump threading. Any other place? I think we need an target hook to control 
> it. 
> 

You can effectively disable jump-threading using:
  --param max-jump-thread-duplication-stmts=0

(grep dump files for "Jumps threaded")

I don't see Andrew's code vectorized even with jump-threading disabled
so I think Andrew is correct and this is some other missed optimization.
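
The "grep dump files" step can also be done programmatically. The sketch below counts the lines of an already opened file that contain a given string; the only thing it assumes about the dump format is that the text "Jumps threaded" appears in it, as stated above. The function name `count_lines_containing` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Count the lines of an open file that contain `needle`.
   Lines longer than the buffer are processed in pieces, which can
   miscount; acceptable for a quick check of a dump file. */
long count_lines_containing(FILE *fp, const char *needle)
{
    char line[4096];
    long hits = 0;

    assert(fp != NULL && needle != NULL);
    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL)
            hits++;
    }
    return hits;
}
```

Calling it with "Jumps threaded" on each dump of interest, once with and once without the --param setting, shows whether the parameter actually changed anything for that pass.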

James



Re: build broken on ppc linux?!

2013-11-22 Thread Eric Botcazou
> Would appreciate a fix/work around!

Configure with --disable-libsanitizer.

-- 
Eric Botcazou


Re: build broken on ppc linux?!

2013-11-22 Thread Arnaud Charlet
> > Would appreciate a fix/work around!
> 
> Configure with --disable-libsanitizer.

Will do, thanks.


Re: build broken on ppc linux?!

2013-11-22 Thread Jakub Jelinek
On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote:
> > >>> Looks like another issue for the libsanitizer maintainers.
> > >>
> > >> I've been doing bootstraps, but didn't see this because the
> > >> kernel header linux/vt.h use on the RHEL6 system I was doing
> > >> builds on has that field renamed.  Looking at our SLES11
> > >> devel system I do see the problematic header file.
> > >
> > > Yes, it only seems to be a problem with SUSE kernels:
> > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html
> > 
> > As my bugreport is being ignored it would help if one of our
> > partners (hint! hint!) would raise this issue via the appropriate
> > channel ;)
> 
> BTW I do not know if this is related, but my build of GCC is stuck
> currently with this error message:
> 
> <<
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: 
> Assembler messages:
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821:
>  Error: .cfi_endproc without corresponding .cfi_startproc
> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive
> make[4]: *** [sanitizer_linux.lo] Error 1
> >>
> 
> Would appreciate a fix/work around!

I guess something like this could fix this.
Though, no idea if clang has any similar macro, or if llvm always
uses .cfi_* directives, or what.  Certainly for GCC, if
__GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them
(either as doesn't support them, or gcc simply hasn't been configured to use
them, etc.).  In that case GCC emits .eh_frame by hand, and it isn't really
possible to tweak that.  Kostya?

--- libsanitizer/sanitizer_common/sanitizer_linux.cc    2013-11-12 11:31:00.154740857 +0100
+++ libsanitizer/sanitizer_common/sanitizer_linux.cc    2013-11-22 12:50:50.107420695 +0100
@@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v
                         *%r8  = new_tls,
                         *%r10 = child_tidptr)
                         */
+#ifdef __GCC_HAVE_DWARF2_CFI_ASM
                        ".cfi_endproc\n"
+#endif
                        "syscall\n"
 
                        /* if (%rax != 0)
@@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v
                        "jnz    1f\n"
 
                        /* In the child. Terminate unwind chain. */
+#ifdef __GCC_HAVE_DWARF2_CFI_ASM
                        ".cfi_startproc\n"
                        ".cfi_undefined %%rip;\n"
+#endif
                        "xorq   %%rbp,%%rbp\n"
 
                        /* Call "fn(arg)". */
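
The guard in the patch relies on GCC predefining `__GCC_HAVE_DWARF2_CFI_ASM` only when it emits `.cfi_*` directives itself. A minimal way to see which case applies to a given compiler and configuration is a translation unit like the following; `compiler_emits_cfi_asm` is a hypothetical helper name used only for this sketch.

```c
#include <assert.h>

/* Returns 1 if the compiler predefined __GCC_HAVE_DWARF2_CFI_ASM for
   this translation unit, 0 otherwise. */
int compiler_emits_cfi_asm(void)
{
#ifdef __GCC_HAVE_DWARF2_CFI_ASM
    return 1;
#else
    return 0;
#endif
}
```

If this returns 0 in the failing configuration, the hand-written `.cfi_endproc` in the inline asm has no matching `.cfi_startproc` from the compiler, which is consistent with the assembler error quoted above.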

Jakub


RE: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Bingfeng Mei
Yes, it can be vectorized with your suggestion.

~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
-ftree-vectorize  -save-temps --param max-jump-thread-duplication-stmts=0

See the attached assembly file.

Bingfeng


-----Original Message-----
From: James Greenhalgh [mailto:james.greenha...@arm.com] 
Sent: 22 November 2013 11:50
To: Bingfeng Mei
Cc: Andrew Pinski; gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following 
vectorization

On Fri, Nov 22, 2013 at 11:03:22AM +, Bingfeng Mei wrote:
> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time. 
> 
> Try this 
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-vrp
> 
> The code can be vectorized. 
> 
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that 
> apply
> jump threading. Any other place? I think we need an target hook to control 
> it. 
> 

You can effectively disable jump-threading using:
  --param max-jump-thread-duplication-stmts=0

(grep dump files for "Jumps threaded")

I don't see Andrew's code vectorized even with jump-threading disabled
so I think Andrew is correct and this is some other missed optimization.

James




vect-ifconv-2.s
Description: vect-ifconv-2.s


Re: build broken on ppc linux?!

2013-11-22 Thread Konstantin Serebryany
> As my bugreport is being ignored it would help if one of our

Sorry. Which one?

> partners (hint! hint!) would raise this issue via the appropriate
> channel ;)
>
> Richard.


Re: build broken on ppc linux?!

2013-11-22 Thread Konstantin Serebryany
On Fri, Nov 22, 2013 at 3:56 PM, Jakub Jelinek  wrote:
> On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote:
>> > >>> Looks like another issue for the libsanitizer maintainers.
>> > >>
>> > >> I've been doing bootstraps, but didn't see this because the
>> > >> kernel header linux/vt.h use on the RHEL6 system I was doing
>> > >> builds on has that field renamed.  Looking at our SLES11
>> > >> devel system I do see the problematic header file.
>> > >
>> > > Yes, it only seems to be a problem with SUSE kernels:
>> > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html
>> >
>> > As my bugreport is being ignored it would help if one of our
>> > partners (hint! hint!) would raise this issue via the appropriate
>> > channel ;)
>>
>> BTW I do not know if this is related, but my build of GCC is stuck
>> currently with this error message:
>>
>> <<
>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: 
>> Assembler messages:
>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821:
>>  Error: .cfi_endproc without corresponding .cfi_startproc
>> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive
>> make[4]: *** [sanitizer_linux.lo] Error 1
>> >>
>>
>> Would appreciate a fix/work around!
>
> I guess something like this could fix this.
> Though, no idea if clang has any similar macro, or if llvm always
> uses .cfi_* directives, or what.  Certainly for GCC, if
> __GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them
> (either as doesn't support them, or gcc simply hasn't been configured to use
> them, etc.).  In that case GCC emits .eh_frame by hand, and it isn't really
> possible to tweak that.  Kostya?
>
> --- libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-12 
> 11:31:00.154740857 +0100
> +++ libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-22 
> 12:50:50.107420695 +0100
> @@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v
>  *%r8  = new_tls,
>  *%r10 = child_tidptr)
>  */
> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM
> ".cfi_endproc\n"
> +#endif
> "syscall\n"
>
> /* if (%rax != 0)
> @@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v
> "jnz1f\n"
>
> /* In the child. Terminate unwind chain. */
> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM
> ".cfi_startproc\n"
> ".cfi_undefined %%rip;\n"
> +#endif
> "xorq   %%rbp,%%rbp\n"
>
> /* Call "fn(arg)". */

These CFI directives were completely removed in upstream at
http://llvm.org/viewvc/llvm-project?rev=192196&view=rev
Strangely, this did not get into the last merge...

Anyway, these cfi_* will (should, at least) disappear with the next
merge which I hope to do in ~ 1 week.
(Or anyone is welcome to delete these now as a separate commit, but
please make sure the code matches the one in upstream)

--kcc


>
> Jakub


Re: build broken on ppc linux?!

2013-11-22 Thread Konstantin Serebryany
On Fri, Nov 22, 2013 at 4:31 PM, Konstantin Serebryany
 wrote:
> On Fri, Nov 22, 2013 at 3:56 PM, Jakub Jelinek  wrote:
>> On Fri, Nov 22, 2013 at 12:47:17PM +0100, Arnaud Charlet wrote:
>>> > >>> Looks like another issue for the libsanitizer maintainers.
>>> > >>
>>> > >> I've been doing bootstraps, but didn't see this because the
>>> > >> kernel header linux/vt.h use on the RHEL6 system I was doing
>>> > >> builds on has that field renamed.  Looking at our SLES11
>>> > >> devel system I do see the problematic header file.
>>> > >
>>> > > Yes, it only seems to be a problem with SUSE kernels:
>>> > > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html
>>> >
>>> > As my bugreport is being ignored it would help if one of our
>>> > partners (hint! hint!) would raise this issue via the appropriate
>>> > channel ;)
>>>
>>> BTW I do not know if this is related, but my build of GCC is stuck
>>> currently with this error message:
>>>
>>> <<
>>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: 
>>> Assembler messages:
>>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821:
>>>  Error: .cfi_endproc without corresponding .cfi_startproc
>>> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive
>>> make[4]: *** [sanitizer_linux.lo] Error 1
>>> >>
>>>
>>> Would appreciate a fix/work around!
>>
>> I guess something like this could fix this.
>> Though, no idea if clang has any similar macro, or if llvm always
>> uses .cfi_* directives, or what.  Certainly for GCC, if
>> __GCC_HAVE_DWARF2_CFI_ASM isn't defined, then GCC doesn't emit them
>> (either as doesn't support them, or gcc simply hasn't been configured to use
>> them, etc.).  In that case GCC emits .eh_frame by hand, and it isn't really
>> possible to tweak that.  Kostya?
>>
>> --- libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-12 
>> 11:31:00.154740857 +0100
>> +++ libsanitizer/sanitizer_common/sanitizer_linux.cc2013-11-22 
>> 12:50:50.107420695 +0100
>> @@ -785,7 +785,9 @@ uptr internal_clone(int (*fn)(void *), v
>>  *%r8  = new_tls,
>>  *%r10 = child_tidptr)
>>  */
>> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM
>> ".cfi_endproc\n"
>> +#endif
>> "syscall\n"
>>
>> /* if (%rax != 0)
>> @@ -795,8 +797,10 @@ uptr internal_clone(int (*fn)(void *), v
>> "jnz1f\n"
>>
>> /* In the child. Terminate unwind chain. */
>> +#ifdef __GCC_HAVE_DWARF2_CFI_ASM
>> ".cfi_startproc\n"
>> ".cfi_undefined %%rip;\n"
>> +#endif
>> "xorq   %%rbp,%%rbp\n"
>>
>> /* Call "fn(arg)". */
>
> These CFI directives were completely removed in upstream at
> http://llvm.org/viewvc/llvm-project?rev=192196&view=rev
> Strangely, this did not get into the last merge...

Ah, no surprise.
The merge was done from llvm's r191666, which is earlier than 192196


>
> Anyway, these cfi_* will (should, at least) disappear with the next
> merge which I hope to do in ~ 1 week.
> (Or anyone is welcome to delete these now as a separate commit, but
> please make sure the code matches the one in upstream)
>
> --kcc
>
>
>>
>> Jakub


Re: build broken on ppc linux?!

2013-11-22 Thread Martin Jambor
On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote:
> > As my bugreport is being ignored it would help if one of our
> 
> Sorry. Which one?

I believe richi meant
https://bugzilla.novell.com/show_bug.cgi?id=849180

Martin

> 
> > partners (hint! hint!) would raise this issue via the appropriate
> > channel ;)

:-)


Re: build broken on ppc linux?!

2013-11-22 Thread Konstantin Serebryany
On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor  wrote:
> On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote:
>> > As my bugreport is being ignored it would help if one of our
>>
>> Sorry. Which one?
>
> I believe richi meant
> https://bugzilla.novell.com/show_bug.cgi?id=849180

I don't have access there.


Re: proposal to make SIZE_TYPE more flexible

2013-11-22 Thread Joseph S. Myers
On Fri, 22 Nov 2013, DJ Delorie wrote:

> If I come up with some table-driven API to register
> "integer-like-types" and search/sort/choose from them, would that be a
> good starting point?  Then we can #define *_type_node to a function
> call perhaps.

I am doubtful that it's appropriate for e.g. integer_type_node to be a 
function call.  I can believe it makes sense for int128_integer_type_node 
to be such a call (more precisely, for int128_integer_type_node to cease 
to exist and for any front-end places needing it to call a function, with 
a type size that should not be a constant 128).  I can also believe it's 
appropriate for the global nodes for trees reflecting C ABI types to go 
somewhere other than tree.h.

I've no idea whether a table-driven API for anything would be a good 
starting point.  That depends on a detailed analysis of the current 
situation and its deficiencies for whatever you are proposing replacing 
with such an API.

I *am* reasonably confident that the places handling hardcoded lists of 
intQI_type_node, intHI_type_node, ... would better iterate over whatever 
supported integer modes may be present in the particular compiler 
configuration (and have some set of signed / unsigned / atomic types 
associated with integer modes) rather than hardcoding a list.
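
As a rough illustration of iterating over a table instead of a hardcoded list, the following C sketch picks the narrowest supported integer mode of at least a given width. Everything in it is hypothetical: `struct int_mode_info` and `smallest_mode_for` do not exist in GCC, and the table contents would come from whatever the particular compiler configuration provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one integer mode available in a given
   compiler configuration; none of these names exist in GCC. */
struct int_mode_info {
    const char *name;      /* e.g. "QI", "HI" */
    unsigned    bits;      /* precision in bits */
    int         supported; /* nonzero if this configuration provides it */
};

/* Return the narrowest supported mode with at least `min_bits` bits,
   or NULL if none qualifies.  The table may be in any order. */
const struct int_mode_info *
smallest_mode_for(const struct int_mode_info *table, size_t n,
                  unsigned min_bits)
{
    const struct int_mode_info *best = NULL;

    assert(table != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!table[i].supported || table[i].bits < min_bits)
            continue;
        if (best == NULL || table[i].bits < best->bits)
            best = &table[i];
    }
    return best;
}
```

With a table like this, a target with an unusual width (say a 20-bit mode) is handled by adding a row, not by editing every place that walks the QI/HI/SI/... list by hand.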

It would not surprise me if some of the global type nodes either aren't 
needed at all or, being only used for built-in functions, should actually 
be defined in builtin-types.def rather than tree.[ch].  For example, 
complex_integer_type_node and float_ptr_type_node.  But I don't think 
cleaning up those would actually help in any way towards your goal; it 
would be a completely orthogonal cleanup.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Richard Biener
On Fri, Nov 22, 2013 at 12:03 PM, Bingfeng Mei  wrote:
> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time.
>
> Try this
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-vrp
>
> The code can be vectorized.
>
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that 
> apply
> jump threading. Any other place? I think we need an target hook to control it.

Surely not.  It's just the usual phase ordering issue that cannot be avoided
in all cases.  Fix if-conversion instead.

Richard.

> Thanks,
> Bingfeng
>
> -----Original Message-----
> From: Andrew Pinski [mailto:pins...@gmail.com]
> Sent: 21 November 2013 21:26
> To: Bingfeng Mei
> Cc: gcc@gcc.gnu.org
> Subject: Re: Jump threading in tree dom pass prevents if-conversion & 
> following vectorization
>
> On Thu, Nov 21, 2013 at 7:11 AM, Bingfeng Mei  wrote:
>> Hi,
>> I am doing some investigation on loops can be vectorized
>> by LLVM, but not GCC. One example is loop that contains
>> more than one if-else constructs.
>>
>> typedef signed char int8;
>> #define FFT 128
>>
>> typedef struct {
>> int8   exp[FFT];
>> } feq_t;
>>
>> void test(feq_t *feq)
>> {
>> int k;
>> int feqMinimum = 15;
>> int8 *exp = feq->exp;
>>
>> for (k=0;k<FFT;k++) {
>> exp[k] -= feqMinimum;
>> if(exp[k]<-15) exp[k] = -15;
>> if(exp[k]>15) exp[k]  = 15;
>> }
>> }
>>
>> Compile it with 4.8.2 on x86_64
>> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
>> -ftree-vectorize  -save-temps
>>
>> It is not vectorized because if-else constructs are not properly
>> if-converted. Looking into .ifcvt file, I found the loop is not
>> if-converted because of bad if-else structure. One branch jumps directly
>> into another branch. Digging a bit deeper, I found such structure
>> is generated by dom1 pass doing jump threading optimization.
>> So recompile with
>>
>> ~/install-4.8/bin/gcc ghs-algorithms_380.c -O2 -fdump-tree-ifcvt-details 
>> -ftree-vectorize  -save-temps -fno-tree-dominator-opts
>>
>> It is magically if-converted and vectorized! Same on our target,
>> performance is improved greatly in this example.
>>
>> It seems to me that doing jump threading for architectures
>> support if-conversion is not a good idea. Original if-else structures
>> are damaged so that if-conversion cannot proceed, so are vectorization
>> and maybe other optimizations. Should we try to identify those "bad"
>> jump threading and skip them for such architectures?
>
> This is not a bad jump threading at all.  In fact I think this is just
> a misoptimization exposed by DOM.  Rewriting it like:
> #define FFT 128
>
> typedef struct {
> signed char   exp[FFT];
> } feq_t;
>
> void test(feq_t *feq)
> {
> int k;
> int feqMinimum = 15;
> signed char *exp = feq->exp;
>
> for (k=0;k<FFT;k++) {
> signed char temp = exp[k] - feqMinimum;
> if(temp<-15) temp = -15;
> if(temp>15) temp  = 15;
> exp[k] = temp;
> }
> }
>
> --- CUT 
> Also shows the issue even without any jump threading involved (turning
> off DOM does not fix my example).  Please file a bug with both your
> and my examples.
>
> Also what DOM is doing is getting rid of the extra store to exp[k] in
> some cases.
>
>
>>
>> Bingfeng Mei
>> Broadcom UK
>>
>>
>>
>


Re: build broken on ppc linux?!

2013-11-22 Thread Richard Biener
On Fri, Nov 22, 2013 at 1:36 PM, Konstantin Serebryany
 wrote:
> On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor  wrote:
>> On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote:
>>> > As my bugreport is being ignored it would help if one of our
>>>
>>> Sorry. Which one?
>>
>> I believe richi meant
>> https://bugzilla.novell.com/show_bug.cgi?id=849180
>
> I don't have access there.

The hint was directed at the IBM people.

Richard.


Re: rs6000: load_multiple code

2013-11-22 Thread Alan Modra
On Fri, Nov 22, 2013 at 09:31:18AM +, Paulo Matos wrote:
> > From: Alan Modra [mailto:amo...@gmail.com]
> > On Wed, Nov 20, 2013 at 05:06:13PM +, Paulo Matos wrote:
> > > I am looking into how rs6000 implements load multiple code
> > [snip]
> > 
> > No pseudos are involved for the destination.  See the FAIL in
> > rs6000.md load_multiple.
> 
> Right, I missed that bit:
> if (...
> || REGNO (operands[0]) >= 32)
>   FAIL;
> 
> This will basically never match at expand time then, and will have little, if 
> any, use before register allocation then. Right?

Right.  You'll find store_multiple used in function prologues and
load_multiple in epilogues, with -Os if the target supports the string
insns.  movmemsi is of more interest in code elsewhere, and you'll see
a comment there about the register allocator.  :)

-- 
Alan Modra
Australia Development Lab, IBM


Re: build broken on ppc linux?!

2013-11-22 Thread Martin Jambor
Hi,

On Fri, Nov 22, 2013 at 04:36:47PM +0400, Konstantin Serebryany wrote:
> On Fri, Nov 22, 2013 at 4:35 PM, Martin Jambor  wrote:
> > On Fri, Nov 22, 2013 at 04:19:26PM +0400, Konstantin Serebryany wrote:
> >> > As my bugreport is being ignored it would help if one of our
> >>
> >> Sorry. Which one?
> >
> > I believe richi meant
> > https://bugzilla.novell.com/show_bug.cgi?id=849180
> 
> I don't have access there.

Sorry, although I thought I had checked that the bug was public, apparently I
did it wrong and it is not.  Anyway, it is a bug against SLES 11, which
does not have the kernel patch to make vt.h compilable as C++.

Martin


Re: build broken on ppc linux?!

2013-11-22 Thread Peter Bergner
On Fri, 2013-11-22 at 12:30 +0100, Richard Biener wrote:
> On Fri, Nov 22, 2013 at 1:57 AM, Jonathan Wakely  
> wrote:
> > Yes, it only seems to be a problem with SUSE kernels:
> > http://gcc.gnu.org/ml/gcc/2013-11/msg00090.html
> 
> As my bugreport is being ignored it would help if one of our
> partners (hint! hint!) would raise this issue via the appropriate
> channel ;)

Ok, I'll open a bug on our side and we'll see if that helps
move things along.

Peter




Re: build broken on ppc linux?!

2013-11-22 Thread FX
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: 
> Assembler messages:
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821:
>  Error: .cfi_endproc without corresponding .cfi_startproc
> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive
> make[4]: *** [sanitizer_linux.lo] Error 1

I’ve posted this to the list before, and it turns out you need a “recent” linux 
kernel and “recent” binutils to bootstrap GCC these days. But to keep the fun, 
“recent” is neither documented nor tested at configure time, so you end up with 
useless error messages.

I’ve filed bug reports about it 
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59067 and 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068), which have been dutifully 
ignored. My opinion is that unless the level of support of libsanitizer is 
increased, it should not be built by default (or build it only if it’s 
supported). Causing such bootstrap issues would not be tolerated in other parts 
of the compiler.

FX

Re: build broken on ppc linux?!

2013-11-22 Thread Konstantin Serebryany
On Fri, Nov 22, 2013 at 7:00 PM, FX  wrote:
>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: 
>> Assembler messages:
>> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:821:
>>  Error: .cfi_endproc without corresponding .cfi_startproc
>> :21485: Error: open CFI at the end of file; missing .cfi_endproc directive
>> make[4]: *** [sanitizer_linux.lo] Error 1
>
> I’ve posted this to the list before, and turns out you need “recent” linux 
> kernel and “recent” binutils to bootstrap GCC these days. But to keep the 
> fun, “recent” is neither documented nor tested at configure time, so you end 
> up with useless error messages.
>
> I’ve filed bug reports about it 
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59067 and 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59068), which have been dutifully 
> ignored. My opinion is that unless the level of support of libsanitizer is 
> increased, it should not be built by default (or build it only if it’s 
> supported). Causing such bootstrap issues would not be tolerated in other 
> parts of the compiler.

I am all for disabling libsanitizer if something in the tool chain is
old (binutils, kernel, compiler, etc).

>
> FX


RE: cross compile & exceptions

2013-11-22 Thread BELBACHIR Selim
I did this in order to build gcc, libgcc and libstdc++ independently.


when I do the simple integrated build process (following 
http://gcc.gnu.org/install)  :

cd $(GCC_OBJDIR); CFLAGS="-g -O0" $(GCC_SRCDIR)/configure \
-quiet \
--prefix=$(INSTALLDIR) \
--target=$(TARGET) \
--enable-languages=c,c++,ada \
--disable-nls \
--disable-decimal-float \
--disable-fixed-point \
--disable-libmudflap \
--disable-libffi \
--disable-libssp \
--disable-shared \
--disable-threads \
--without-headers \
--disable-libada \
--enable-version-specific-runtime-lib \
--disable-bootstrap \
--enable-checking=release
make -C $(GCC_OBJDIR)



I encounter a problem in libstdc++-v3:

Configuring in prism/libstdc++-v3
Configuring in prism/libiberty
configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
make[2]: *** [configure-target-libiberty] Error 1
make[2]: *** Waiting for unfinished jobs
configure: error: Link tests are not allowed after GCC_NO_EXECUTABLES.
make[2]: *** [configure-target-libstdc++-v3] Error 1
make[1]: *** [all] Error 2
make: *** [gcc_make] Error 2

because stdio.h is not found (my libc is built externally, and --without-headers 
prevents gcc from knowing where its headers are).



I tried --with-headers with my own libc header files (an incomplete home-made 
libc), but this time I got stuck on libgcc2, which requires a unistd.h that I 
don't have (or want):

In file included from /vues_statiques/FPGA/belbachir/prism2/MPUCores/tools/gcc-4.5.2/libgcc/../gcc/libgcc2.c:29:0:
/vues_statiques/FPGA/belbachir/prism2/MPUCores/tools/gcc-4.5.2/libgcc/../gcc/tsystem.h:102:20: fatal error: unistd.h: No such file or directory
compilation terminated.



So, to build libgcc I would need --without-headers to compensate for my small 
libc, and to build libstdc++ I would have to use --with-headers in order to 
provide stdio.h ...


Do you know a better way to solve this than building gcc, libgcc and libstdc++ 
independently?




post_inc mem in parallel rtx

2013-11-22 Thread BELBACHIR Selim
Hi,

I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3) 

/* The register should have been invalidated.  */
  gcc_assert (REG_VALUES (dreg)->elt == 0);    <<== assert(false)


I investigated the dump and found that the crash occurred during 207r.dse2 pass.

Here is what I saw in the previous dump (206r.pro_and_epilogue) :

(insn 104 47 105 7 (parallel [
        (set (reg:CC_NOOV 56 $CCI)
            (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
                    (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))
                (const_int 0 [0])))
        (set (reg:SI 16 $R0 [153])
            (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
                (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])))
    ])

Note the post_inc MEM on $R2 appearing twice.

This RTL matches my pattern below (predicates and constraints OK):

(define_insn "subsi3_compare0"
  [(set (reg:CC_NOOV CCI_REG)
        (compare:CC_NOOV
          (minus:SI
            (match_operand:SI 1 "general_operand" "g")
            (match_operand:SI 2 "general_operand" "g"))
          (const_int 0)))
   (set (match_operand:SI 0 "register_operand" "=r")
        (minus:SI
          (match_dup 1)
          (match_dup 2)))]

But I think it may be an error to allow a post_inc MEM in operands 1 & 2 of 
this parallel rtx.
When I use a more restrictive constraint that forbids post_inc, the 
crash in cselib.c disappears.

Question: what does GCC understand when the md describes a pattern allowing 
the same post_inc MEM in 2 slots of a parallel rtx?
Is it forbidden? Is the MEM address supposed to be incremented twice?

Regards,

Selim



Re: cross compile & exceptions

2013-11-22 Thread Ian Lance Taylor
On Fri, Nov 22, 2013 at 8:43 AM, BELBACHIR Selim wrote:
> I did this in order to build gcc, libgcc and libstdc++ independently.

OK, fair enough.

Sorry, I don't know what is happening with your original bug report.

Ian


Re: cross compile & exceptions

2013-11-22 Thread Andrew Haley
On 11/22/2013 04:43 PM, BELBACHIR Selim wrote:

> 
> So, to build libgcc I would need --without-header to compensate for my small 
> libc, and to build libstdc++ I would have to use --with-header in order to 
> provide stdio.h ...
> 
> 
> Do you know a better way to solve that than building gcc, libgcc & libstdc++ 
> independently ?

What is $(TARGET) ?

Andrew.



RE: cross compile & exceptions

2013-11-22 Thread BELBACHIR Selim
>> 
>> So, to build libgcc I would need --without-header to compensate for my small 
>> libc, and to build libstdc++ I would have to use --with-header in order to 
>> provide stdio.h ...
>> 
>> 
>> Do you know a better way to solve that than building gcc, libgcc & libstdc++ 
>> independently ?

> What is $(TARGET) ?

>Andrew.

$(TARGET) is a private embedded platform (cpu/os/lib)

Selim



Re: post_inc mem in parallel rtx

2013-11-22 Thread Jeff Law

On 11/22/13 09:43, BELBACHIR Selim wrote:

> Hi,
>
> I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3)
>
>  /* The register should have been invalidated.  */
>    gcc_assert (REG_VALUES (dreg)->elt == 0);    <<== assert(false)
>
>
> I investigated the dump and found that the crash occurred during 207r.dse2 pass.
>
> Here is what I saw in the previous dump (206r.pro_and_epilogue) :
>
> (insn 104 47 105 7 (parallel [
>         (set (reg:CC_NOOV 56 $CCI)
>             (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
>                     (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))
>                 (const_int 0 [0])))
>         (set (reg:SI 16 $R0 [153])
>             (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
>                 (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])))
>     ])
>
> Note the post_inc MEM on $R2 appearing twice
>
> This rtl match my pattern (predicate and contraint ok) below :
>
> (define_insn "subsi3_compare0"
>   [(set (reg:CC_NOOV CCI_REG)
>         (compare:CC_NOOV
>           (minus:SI
>             (match_operand:SI 1 "general_operand" "g")
>             (match_operand:SI 2 "general_operand" "g"))
>           (const_int 0)))
>    (set (match_operand:SI 0 "register_operand" "=r")
>         (minus:SI
>           (match_dup 1)
>           (match_dup 2)))]
>
> But I think It may be an error to authorize post_inc MEM in this parallel rtx in
> operand 1 & 2.
> When I put a more restrictive constraint which forbid the use of post_inc, the
> crash in cselib.c disappear.
>
> Question : What does GCC understand when the md describes a pattern allowing
> the same post_inc MEM in 2 slot of a parallel rtx ?
> Is it forbidden ? the MEM address is supposed to be incremented twice ?
I think the semantics are defined by the PARALLEL.  Namely that the uses 
are evaluated, then side effects are performed.  So both sets use the 
value before incrementing.


The only question is what is the resulting value, and given the 
fundamental nature of PARALLEL, I think a single visible side effect is 
the most obvious answer.
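
Under that reading, the insn quoted above behaves like the following C model. This is only a sketch of the "evaluate all uses, then perform the side effects once" interpretation; `struct machine` and `subsi3_compare0_model` are names invented for the illustration, not GCC internals, and the model says nothing about what individual passes such as cselib actually assume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical machine state for the example: $R0, $R2, $R6 and one flag. */
struct machine {
  const int32_t *r2;  /* address register used with post_inc */
  int32_t r0;         /* destination of the subtraction */
  int32_t r6;         /* minuend */
  int cc_zero;        /* stands in for the CC_NOOV result: 1 if r6 - mem == 0 */
};

/* Model of
     (parallel [(set cc (compare (minus r6 (mem (post_inc r2))) 0))
                (set r0 (minus r6 (mem (post_inc r2))))])
   assuming all uses are evaluated first and the increment happens once. */
static void subsi3_compare0_model(struct machine *m) {
  assert(m != NULL && m->r2 != NULL);

  /* Step 1: evaluate every input using the old register values. */
  int32_t loaded = *m->r2;      /* both sets see the same memory word */
  int32_t diff = m->r6 - loaded;

  /* Step 2: commit the outputs and the single visible post-increment. */
  m->cc_zero = (diff == 0);
  m->r0 = diff;
  m->r2 += 1;                   /* advance by one SImode word, once */
}
```

With `r6 = 10` and the word at `r2` equal to 7, the model leaves `r0 = 3`, clears the zero flag, and advances `r2` by a single word rather than two.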


Now having said that, there's a distinct possibility various passes 
don't handle this properly.


jeff


Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Jeff Law

On 11/22/13 04:03, Bingfeng Mei wrote:

> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time.
>
> Try this
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details
> -ftree-vectorize  -save-temps -fno-tree-vrp
>
> The code can be vectorized.
>
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that
> apply
> jump threading. Any other place? I think we need an target hook to control it.

No no.  The right thing to do is fix if-conversion.

jeff



RE: post_inc mem in parallel rtx

2013-11-22 Thread BELBACHIR Selim
OK, so I should avoid the auto_inc alternatives in PARALLEL. It is certainly 
quite rare RTL, and I doubt it is worth the effort.


-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: Friday, 22 November 2013 17:55
To: BELBACHIR Selim; gcc@gcc.gnu.org
Subject: Re: post_inc mem in parallel rtx

On 11/22/13 09:43, BELBACHIR Selim wrote:
> Hi,
>
> I encountered a bug in cselib.c:2360 using gnat7.1.2 (gcc4.7.3)
>
>  /* The register should have been invalidated.  */
>    gcc_assert (REG_VALUES (dreg)->elt == 0);    <<== assert(false)
>
>
> I investigated the dump and found that the crash occurred during 207r.dse2 
> pass.
>
> Here is what I saw in the previous dump (206r.pro_and_epilogue) :
>
> (insn 104 47 105 7 (parallel [
>         (set (reg:CC_NOOV 56 $CCI)
>             (compare:CC_NOOV (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
>                     (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32]))
>                 (const_int 0 [0])))
>         (set (reg:SI 16 $R0 [153])
>             (minus:SI (reg/f:SI 22 $R6 [orig:133 D.3274 ] [133])
>                 (mem/f:SI (post_inc:SI (reg:SI 2 $R2 [orig:140 ivtmp.363 ] [140])) [0 MEM[base: D.4517_59, offset: 0B]+0 S4 A32])))
>     ])
>
> Note the post_inc MEM on $R2 appearing twice
>
> This rtl match my pattern (predicate and contraint ok) below :
>
> (define_insn "subsi3_compare0"
>   [(set (reg:CC_NOOV CCI_REG)
>         (compare:CC_NOOV
>           (minus:SI
>             (match_operand:SI 1 "general_operand" "g")
>             (match_operand:SI 2 "general_operand" "g"))
>           (const_int 0)))
>    (set (match_operand:SI 0 "register_operand" "=r")
>         (minus:SI
>           (match_dup 1)
>           (match_dup 2)))]
>
> But I think It may be an error to authorize post_inc MEM in this parallel rtx 
> in operand 1 & 2.
> When I put a more restrictive constraint which forbid the use of post_inc, 
> the crash in cselib.c disappear.
>
> Question : What does GCC understand when the md describes a pattern allowing 
> the same post_inc MEM in 2 slot of a parallel rtx ?
> Is it forbidden ? the MEM address is supposed to be incremented twice ?
I think the semantics are defined by the PARALLEL.  Namely that the uses are 
evaluated, then side effects are performed.  So both sets use the value before 
incrementing.

The only question is what is the resulting value, and given the fundamental 
nature of PARALLEL, I think a single visible side effect is the most obvious 
answer.

Now having said that, there's a distinct possibility various passes don't 
handle this properly.

jeff


Re: post_inc mem in parallel rtx

2013-11-22 Thread Jeff Law

On 11/22/13 10:03, BELBACHIR Selim wrote:

> Ok so I should avoid the auto_inc alternatives in PARALLEL. It's certainly a
> quite rare RTL and I doubt the effort worth it.

That'd be my inclination as well.

I'm not sure what chip you're working on, but those kinds of 
multiple-output instructions tend to cause all kinds of performance 
problems once the chip goes to out-of-order execution.  Basically most 
folks designing the chip allow the operations to run independently, but 
they have to retire as a group.  Thus an insn like that would hold 3 
slots in the retirement buffer (two outputs plus embedded side effect) 
until all three operations are ready to retire.  That can be a real drag 
if the memory reference doesn't hit the cache.


jeff




RE: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Bingfeng Mei
So if we are to fix this in if-conversion, we need to do it in both tree & 
rtl, as neither the ifcvt nor the ce pass can handle it. 

I am still not convinced jump threading is good for targets with predicated 
execution (assuming no fix for if-conversion). I am benchmarking on our 
target now. 

Bingfeng

-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: 22 November 2013 16:58
To: Bingfeng Mei; Andrew Pinski
Cc: gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following 
vectorization

On 11/22/13 04:03, Bingfeng Mei wrote:
> Well, in your modified example, it is still due to jump threading that produce
> code of bad control flow that cannot be if-converted and vectorized, though in
> tree-vrp pass this time.
>
> Try this
> ~/install-4.8/bin/gcc vect-ifconv-2.c  -O2 -fdump-tree-ifcvt-details 
> -ftree-vectorize  -save-temps -fno-tree-vrp
>
> The code can be vectorized.
>
> Grep "threading" in gcc, it seems that dom and vrp passes are two places that 
> apply
> jump threading. Any other place? I think we need an target hook to control it.
No no.  The right thing to do is fix if-conversion.

jeff




Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Jeff Law

On 11/22/13 10:13, Bingfeng Mei wrote:

> So if we are about to fix this in if-conversion, we need to do both in tree & rtl
> as both ifcvt & ce passes cannot handle it.
>
> I am still not convinced jump threading is good for target with predicated
> execution (assuming no fix for if-conversion). I am doing benchmarking on our
> target now.

I'd be quite surprised if your tests show that it's not beneficial.

In simplest terms jump threading identifies conditional branches which 
can have their destination statically determined based on the path taken 
to the static branch.
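
As a hand-written illustration (plain C, not compiler output), the two functions below compute the same result. The second shows roughly what the control flow looks like once the test on `flag` has been threaded: on each path reaching that test its outcome is already known, so the path can go straight to the matching destination. The function names are made up for the example.

```c
#include <assert.h>

/* Before: the second test re-examines `flag` even though each incoming
   path already determines its outcome. */
static int before_threading(int x) {
  int flag;
  int result;
  if (x > 10)
    flag = 1;
  else
    flag = 0;
  if (flag)              /* outcome is known on both incoming paths */
    result = x * 2;
  else
    result = x + 1;
  return result;
}

/* After: each predecessor branches directly to the block it would have
   reached, so `flag` and the second test disappear. */
static int after_threading(int x) {
  if (x > 10)
    return x * 2;
  return x + 1;
}

/* Sanity check that the rewrite preserved behaviour on a small range. */
static void check_equivalent(void) {
  for (int x = -20; x <= 20; ++x)
    assert(before_threading(x) == after_threading(x));
}
```

The rewritten form executes one conditional branch fewer per call, but it also has a different control-flow shape, which is the part later passes have to cope with.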


And more generally, we try *real* hard not to start enabling/disabling 
tree passes on a per-target basis.  The end result if we were to start 
doing that is an unmaintainable mess.


Jeff



Re: cross compile & exceptions

2013-11-22 Thread Andrew Haley
On 11/22/2013 04:54 PM, BELBACHIR Selim wrote:
>>>
>>> So, to build libgcc I would need --without-header to compensate for my 
>>> small libc, and to build libstdc++ I would have to use --with-header in 
>>> order to provide stdio.h ...
>>>
>>>
>>> Do you know a better way to solve that than building gcc, libgcc & 
>>> libstdc++ independently ?
> 
>> What is $(TARGET) ?
> 
>> Andrew.
> 
> $(TARGET) is a private embedded platform (cpu/os/lib)

Right, but GCC is trying to build against unistd.h.  It's not going
to do that unless you tell it you have a UNIX-like target.

I'd start by building GCC against newlib.

Andrew.



Re: build broken on ppc linux?!

2013-11-22 Thread Mike Stump
On Nov 22, 2013, at 4:31 AM, Konstantin Serebryany wrote:
> These CFI directives were completely removed in upstream at
> http://llvm.org/viewvc/llvm-project?rev=192196&view=rev
> Strangely, this did not get into the last merge...
> 
> Anyway, these cfi_* will (should, at least) disappear with the next
> merge which I hope to do in ~ 1 week.
> (Or anyone is welcome to delete these now as a separate commit, but
> please make sure the code matches the one in upstream)

This is exactly the patch referenced in the pointer to the upstream repo.  
Arno, does this fix the build for you?

Ok?

Index: libsanitizer/sanitizer_common/sanitizer_linux.cc
===
--- libsanitizer/sanitizer_common/sanitizer_linux.cc(revision 205278)
+++ libsanitizer/sanitizer_common/sanitizer_linux.cc(working copy)
@@ -785,7 +785,6 @@ uptr internal_clone(int (*fn)(void *), v
 *%r8  = new_tls,
 *%r10 = child_tidptr)
 */
-   ".cfi_endproc\n"
"syscall\n"
 
/* if (%rax != 0)
@@ -795,8 +794,9 @@ uptr internal_clone(int (*fn)(void *), v
"jnz1f\n"
 
/* In the child. Terminate unwind chain. */
-   ".cfi_startproc\n"
-   ".cfi_undefined %%rip;\n"
+   // XXX: We should also terminate the CFI unwind chain
+   // here. Unfortunately clang 3.2 doesn't support the
+   // necessary CFI directives, so we skip that part.
"xorq   %%rbp,%%rbp\n"
 
/* Call "fn(arg)". */


Re: build broken on ppc linux?!

2013-11-22 Thread Jakub Jelinek
On Fri, Nov 22, 2013 at 10:11:18AM -0800, Mike Stump wrote:
> On Nov 22, 2013, at 4:31 AM, Konstantin Serebryany 
>  wrote:
> > These CFI directives were completely removed in upstream at
> > http://llvm.org/viewvc/llvm-project?rev=192196&view=rev
> > Strangely, this did not get into the last merge...
> > 
> > Anyway, these cfi_* will (should, at least) disappear with the next
> > merge which I hope to do in ~ 1 week.
> > (Or anyone is welcome to delete these now as a separate commit, but
> > please make sure the code matches the one in upstream)
> 
> This is exactly the patch referenced in the pointer to the upstream repo.  
> Arno, does this fix the build for you?
> 
> Ok?

Yes (though, I really wonder why it needs to be removed rather than only
conditionally added based on preprocessor macros, but that is a question
for upstream).

Jakub


Re: build broken on ppc linux?!

2013-11-22 Thread Arnaud Charlet
> This is exactly the patch referenced in the pointer to the upstream repo.
> Arno, does this fix the build for you?

Well now I encounter:

/users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: In 
function '__sanitizer::uptr __sanitizer::internal_filesize(__sanitizer::fd_t)':
/users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:176:19:
 warning: 'st.stat::st_size' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
   return (uptr)st.st_size;
   ^

So I guess that's what we call "progress".

I'll keep using --disable-libsanitizer for the time being, this library is
clearly not quite productized yet IMO.

Arno


RE: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Bingfeng Mei
I understand what jump threading does. In theory it reduces the number of 
instructions executed. But it creates messy program structure and prevents 
further optimizations, at least for the target we have (a VLIW-based DSP with 
predicated execution). 

I just ran through the 8 audio codecs we use as an internal benchmark. 5 out of 8 
codecs have similar performance with and without jump threading (give or take 
0.1-0.2%). For the other 3, the version without jump threading is faster by 
1-2.5%. I didn't even enable -ftree-vectorize.

I am going to investigate further and check whether if-conversion can 
be fixed without disabling jump threading.

Bingfeng 

-Original Message-
From: Jeff Law [mailto:l...@redhat.com] 
Sent: 22 November 2013 17:17
To: Bingfeng Mei; Andrew Pinski; Richard Biener
Cc: gcc@gcc.gnu.org
Subject: Re: Jump threading in tree dom pass prevents if-conversion & following 
vectorization

On 11/22/13 10:13, Bingfeng Mei wrote:
> So if we are about to fix this in if-conversion, we need to do both in tree & 
> rtl as both ifcvt & ce passes cannot handle it.
>
> I am still not convinced jump threading is good for target with predicated 
> execution (assuming no fix for if-conversion). I am doing benchmarking on our 
> target now.
I'd be quite surprised if your tests show that it's not beneficial.

In simplest terms jump threading identifies conditional branches which 
can have their destination statically determined based on the path taken 
to the static branch.

And more generally, we try *real* hard not to start enabling/disabling 
tree passes on a per-target basis.  The end result if we were to start 
doing that is an unmaintainable mess.

Jeff




Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Alec Teal

Hey,

What is jump threading? I've not heard of it before ( 
http://en.wikipedia.org/wiki/Jump_threading is basically the description 
of the compiler flag )


Alec

On 22/11/13 19:06, Bingfeng Mei wrote:

> I understand what jump threading does. In theory it reduces number of
> instructions executed. But it creates messy program structure and prevents
> further optimizations, at least for target we have (VLIW-based DSP with
> predicated execution).
>
> I just ran through 8 audio codecs we use as internal benchmark. 5 out of 8
> codecs have similar performance with/without jump threading (give or take
> 0.1-0.2%). For the other 3, no
> jump threading version outperforms by 1-2.5%. I didn't even enable
> -ftree-vectorize.
>
> I am going to do some further investigation and check whether if-conversion can
> be fixed without disabling jump threading.
>
> Bingfeng
>
> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: 22 November 2013 17:17
> To: Bingfeng Mei; Andrew Pinski; Richard Biener
> Cc: gcc@gcc.gnu.org
> Subject: Re: Jump threading in tree dom pass prevents if-conversion & following
> vectorization
>
> On 11/22/13 10:13, Bingfeng Mei wrote:
>> So if we are about to fix this in if-conversion, we need to do both in tree & rtl
>> as both ifcvt & ce passes cannot handle it.
>>
>> I am still not convinced jump threading is good for target with predicated
>> execution (assuming no fix for if-conversion). I am doing benchmarking on our
>> target now.
>
> I'd be quite surprised if your tests show that it's not beneficial.
>
> In simplest terms jump threading identifies conditional branches which
> can have their destination statically determined based on the path taken
> to the static branch.
>
> And more generally, we try *real* hard not to start enabling/disabling
> tree passes on a per-target basis.  The end result if we were to start
> doing that is an unmaintainable mess.
>
> Jeff






Re: proposal to make SIZE_TYPE more flexible

2013-11-22 Thread DJ Delorie

> (more precisely, for int128_integer_type_node to cease to exist and
> for any front-end places needing it to call a function, with a type
> size that should not be a constant 128).

The complication I've seen there is, for example, when you're
iterating through types looking for a "best" type, where some of the
types are fixed foo_type_node's and others are dynamic
intN_type_nodes.  Perhaps we should use a hybrid list-plus-table
approach?  So we check for the standard types explicitly, then iterate
through the list of intN types?
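
A toy version of that hybrid lookup might look like the sketch below. Everything in it is hypothetical: `struct int_type`, the two tables, the widths chosen for the standard types, and `smallest_type_for` are stand-ins for illustration and do not correspond to GCC's tree nodes or any existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node: just a name and a precision. */
struct int_type {
  const char *name;
  unsigned precision;   /* width in bits */
};

/* Fixed "standard" types, checked explicitly (widths are assumed). */
static const struct int_type standard_types[] = {
  {"char", 8}, {"short", 16}, {"int", 32}, {"long", 64},
};

/* Target-defined intN types; a back end could register any number. */
static const struct int_type intn_types[] = {
  {"__int20", 20}, {"__int128", 128},
};

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Return the narrowest type holding at least `bits` bits, preferring a
   standard type over an intN type of the same width; NULL if none fits. */
static const struct int_type *smallest_type_for(unsigned bits) {
  const struct int_type *best = NULL;
  assert(bits > 0);   /* a zero-width request is a caller bug */

  /* Pass 1: the standard types. */
  for (size_t i = 0; i < ARRAY_LEN(standard_types); ++i)
    if (standard_types[i].precision >= bits
        && (best == NULL || standard_types[i].precision < best->precision))
      best = &standard_types[i];

  /* Pass 2: the intN table, which only wins when strictly narrower. */
  for (size_t i = 0; i < ARRAY_LEN(intn_types); ++i)
    if (intn_types[i].precision >= bits
        && (best == NULL || intn_types[i].precision < best->precision))
      best = &intn_types[i];

  return best;
}
```

With these tables a 17-bit request selects `__int20`, while a 32-bit request still selects `int`, since the intN entry replaces a standard type only when it is strictly narrower.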

> I can also believe it's appropriate for the global nodes for trees
> reflecting C ABI types to go somewhere other than tree.h.

Which are those?  Why isn't "int" one of those?

> I've no idea whether a table-driven API for anything would be a good 
> starting point.  That depends on a detailed analysis of the current 
> situation and its deficiencies for whatever you are proposing replacing 
> with such an API.

If you want to support more than one intN at a time, IMHO you need
more than just one intN object, hence a table of some sort.

Or are you assuming that any given backend would only be allowed to
define one intN type?  That's already not going to work, as I need
int20_t in addition to the "now standard" int128_t.

> I *am* reasonably confident that the places handling hardcoded lists
> of intQI_type_node, intHI_type_node, ... would better iterate over
> whatever supported integer modes may be present in the particular
> compiler configuration (and have some set of signed / unsigned /
> atomic types associated with integer modes) rather than hardcoding a
> list.

How is this different than the places handling hardcoded lists of
integer_type_node et al?

> It would not surprise me if some of the global type nodes either aren't 
> needed at all or, being only used for built-in functions, should actually 
> be defined in builtin-types.def rather than tree.[ch].  For example, 
> complex_integer_type_node and float_ptr_type_node.  But I don't think 
> cleaning up those would actually help in any way towards your goal; it 
> would be a completely orthogonal cleanup.

Yeah, I'm not trying to take on more work, just trying to hit the
prereqs for my own project.


Re: Jump threading in tree dom pass prevents if-conversion & following vectorization

2013-11-22 Thread Steven Bosscher
On Fri, Nov 22, 2013 at 6:16 PM, Jeff Law wrote:
>> I am still not convinced jump threading is good for target with predicated
>> execution (assuming no fix for if-conversion). I am doing benchmarking on
>> our target now.

Try disabling only jump threading of back edges, loop latches, and
jump threading in small loops.

Any "jump forwarding" is almost always a win.


> I'd be quite surprised if your tests show that it's not beneficial.
>
> In simplest terms jump threading identifies conditional branches which can
> have their destination statically determined based on the path taken to the
> static branch.

Still, optimizing away such conditional branches is not automatically a win.

There have always been issues with tree-ssa DOM doing jump-threading
so aggressively that other passes couldn't handle the resulting
control flow anymore, especially jump threading around/near loops.

Ciao!
Steven


Re: build broken on ppc linux?!

2013-11-22 Thread Jakub Jelinek
On Fri, Nov 22, 2013 at 07:21:07PM +0100, Arnaud Charlet wrote:
> > This is exactly the patch referenced in the pointer to the upstream repo.
> > Arno, does this fix the build for you?
> 
> Well now I encounter:
> 
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc: In 
> function '__sanitizer::uptr 
> __sanitizer::internal_filesize(__sanitizer::fd_t)':
> /users/charlet/fsf/trunk/libsanitizer/sanitizer_common/sanitizer_linux.cc:176:19:
>  warning: 'st.stat::st_size' may be used uninitialized in this function 
> [-Wmaybe-uninitialized]
>return (uptr)st.st_size;
>^
> 
> So I guess that's what we call "progress".
> 
> I'll keep using --disable-libsanitizer for the time being, this library is
> clearly not quite productized yet IMO.

Here is a patch to fix various warnings. The remaining ones I'm seeing are
mostly because libsanitizer incorrectly uses C99/C++11 ... in macros (the
standard requires it to be non-empty). Either use the GNU extension
instead, #define INTERCEPTOR(a, b, c...) and , ## c if needed to get rid
of the preceding comma if empty (though you compile with -pedantic, so you
might get warnings about that too), or rework the macros, or have different
ones for the zero-argument cases (INTERCEPTOR0).
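
For readers unfamiliar with the GNU form, here is a small self-contained sketch. `FORMAT_INTO` and `variadic_macro_demo` are invented for the example and have nothing to do with libsanitizer's real INTERCEPTOR macro; the point is only the `args...` parameter and the `, ## args` comma swallowing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* GNU named variadic parameter: `args...` instead of C99 `...`.
   `, ## args` drops the preceding comma when `args` is empty, so the
   macro may be used with zero extra arguments. `buf` must be an array. */
#define FORMAT_INTO(buf, fmt, args...) \
  snprintf((buf), sizeof(buf), fmt , ## args)

/* Returns 1 if both the zero-argument and the multi-argument uses
   expanded to valid snprintf calls with the expected results. */
static int variadic_macro_demo(void) {
  char a[32], b[32];
  int n1 = FORMAT_INTO(a, "no args");       /* args empty: comma dropped */
  int n2 = FORMAT_INTO(b, "%d+%d", 1, 2);   /* args is `1, 2` */
  assert(n1 == 7);
  assert(n2 == 3);
  return strcmp(a, "no args") == 0 && strcmp(b, "1+2") == 0;
}
```

This is a GNU extension, so a -pedantic build is still expected to complain about it; the separate zero-argument macro avoids that at the cost of a second definition.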

There are some additional warnings caused by the #ifdef SYSCALL_INTERCEPTION
hacks we have to avoid various issues with problematic kernel headers or
libsanitizer code not having non-i?86/x86_64 in mind.

The sanitizer_syscall_linux_x86_64.inc changes fix real bugs, the rest is
just to get the noise level down.
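
To show why the clobber matters, here is a standalone sketch; it is not the libsanitizer code. `raw_syscall2` and `raw_seconds_since_epoch` are invented names, Linux is assumed, and the inline asm path is only used on x86-64 (other targets fall back to libc's syscall()). The kernel writes through the pointer passed to gettimeofday, and the "memory" clobber is what tells the compiler so.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/* Two-argument raw system call. On x86-64 Linux it issues the `syscall`
   instruction directly; elsewhere it falls back to libc's syscall(). */
static long raw_syscall2(long nr, long arg1, long arg2) {
#if defined(__x86_64__) && defined(__linux__)
  long retval;
  /* "rcx" and "r11" are overwritten by the instruction itself. "memory"
     says the kernel may read or write memory reachable through the
     arguments, so the compiler must not keep such values cached in
     registers across the asm or assume they are unchanged afterwards. */
  __asm__ volatile("syscall"
                   : "=a"(retval)
                   : "a"(nr), "D"(arg1), "S"(arg2)
                   : "rcx", "r11", "memory");
  return retval;
#else
  return syscall(nr, arg1, arg2);
#endif
}

/* Seconds since the epoch, obtained through the raw system call. */
static long raw_seconds_since_epoch(void) {
  struct timeval tv = {0, 0};
  long rc = raw_syscall2(SYS_gettimeofday, (long)&tv, 0);
  assert(rc == 0);
  /* Without the "memory" clobber the compiler would be entitled to
     treat tv as still {0, 0} at this point. */
  return (long)tv.tv_sec;
}
```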

--- sanitizer_common/sanitizer_linux.cc.jj  2013-11-12 11:31:00.0 +0100
+++ sanitizer_common/sanitizer_linux.cc 2013-11-22 20:15:26.652376137 +0100
@@ -216,7 +216,7 @@ uptr GetTid() {
 }
 
 u64 NanoTime() {
-  kernel_timeval tv = {};
+  kernel_timeval tv = {0, 0};
   internal_syscall(__NR_gettimeofday, (uptr)&tv, 0);
   return (u64)tv.tv_sec * 1000*1000*1000 + tv.tv_usec * 1000;
 }
--- sanitizer_common/sanitizer_syscall_linux_x86_64.inc.jj  2013-11-12 11:31:00.0 +0100
+++ sanitizer_common/sanitizer_syscall_linux_x86_64.inc 2013-11-22 20:14:32.752657581 +0100
@@ -11,7 +11,7 @@
 
 static uptr internal_syscall(u64 nr) {
   u64 retval;
-  asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11");
+  asm volatile("syscall" : "=a"(retval) : "a"(nr) : "rcx", "r11", "memory");
   return retval;
 }
 
@@ -19,7 +19,7 @@ template 
 static uptr internal_syscall(u64 nr, T1 arg1) {
   u64 retval;
   asm volatile("syscall" : "=a"(retval) : "a"(nr), "D"((u64)arg1) :
-   "rcx", "r11");
+   "rcx", "r11", "memory");
   return retval;
 }
 
@@ -27,7 +27,7 @@ template 
 static uptr internal_syscall(u64 nr, T1 arg1, T2 arg2) {
   u64 retval;
   asm volatile("syscall" : "=a"(retval) : "a"(nr), "D"((u64)arg1),
-   "S"((u64)arg2) : "rcx", "r11");
+   "S"((u64)arg2) : "rcx", "r11", "memory");
   return retval;
 }
 
@@ -35,7 +35,7 @@ template 

Re: build broken on ppc linux?!

2013-11-22 Thread Mike Stump
On Nov 22, 2013, at 10:13 AM, Jakub Jelinek  wrote:
>> This is exactly the patch referenced in the pointer to the upstream repo.  
>> Arno, does this fix the build for you?
>> 
>> Ok?
> 
> Yes

Committed revision 205285.


Re: proposal to make SIZE_TYPE more flexible

2013-11-22 Thread Joseph S. Myers
On Fri, 22 Nov 2013, DJ Delorie wrote:

> > (more precisely, for int128_integer_type_node to cease to exist and
> > for any front-end places needing it to call a function, with a type
> > size that should not be a constant 128).
> 
> The complications I've seen there is, for example, when you're
> iterating through types looking for a "best" type, where some of the
> types are fixed foo_type_node's and others are dynamic
> intN_type_nodes.  Perhaps we sould use a hybrid list-plus-table
> approach?  So we check for the standard types explicitly, then iterate
> through the list of intN types?

In general you need to analyze each such case individually to produce a 
reasoned argument for what it should logically be doing.  Given such 
analyses, maybe then you can identify particular tables of types in 
particular orders (for example) that should be set up to iterate through.

> > I can also believe it's appropriate for the global nodes for trees
> > reflecting C ABI types to go somewhere other than tree.h.
> 
> Which are those?  Why isn't "int" one of those?

I think "int" is one of them.  Those files that have a need for C ABI 
types would include tree-c-abi.h.  Optimizers that aren't e.g. generating 
calls to built-in functions where the C ABI is involved wouldn't include 
that header.  As this should be orthogonal to your project it could just 
as well be part of the tree.h cleanup project.

> > I've no idea whether a table-driven API for anything would be a good 
> > starting point.  That depends on a detailed analysis of the current 
> > situation and its deficiencies for whatever you are proposing replacing 
> > with such an API.
> 
> If you want to support more than one intN at a time, IMHO you need
> more than just one intN object, hence a table of some sort.
> 
> Or are you assuming that any given backend would only be allowed to
> define one intN type?  That's already not going to work, as I need
> int20_t in addition to the "now standard" int128_t.

I am saying that the starting point is understanding what is logically 
correct in the various different places dealing with integer types, and an 
analysis of that is what must drive any API design.

Does the target with __int20 actually have __int128 (i.e. pass 
targetm.scalar_mode_supported_p (TImode))?  But you should indeed be able 
to have an arbitrary number of such types.

> > I *am* reasonably confident that the places handling hardcoded lists
> > of intQI_type_node, intHI_type_node, ... would better iterate over
> > whatever supported integer modes may be present in the particular
> > compiler configuration (and have some set of signed / unsigned /
> > atomic types associated with integer modes) rather than hardcoding a
> > list.
> 
> How is this different than the places handling hardcoded lists of
> integer_type_node et al?

(a) We already have the system for an arbitrary set of integer modes to be 
defined and iterated over, whereas the set of standard C types is 
target-independent.

(b) This sort of thing tends to be more readily addressed through a series 
of small cleanup patches that clearly isolate anything that might possibly 
change behavior at all than through one huge patch.  So any small obvious 
things are naturally separated out.

(c) Iteration over C types has other complications such as preferences 
between different types (e.g. int and long) with the same middle-end 
properties.

(d) I don't think the standard C types are particularly relevant to your 
project.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: proposal to make SIZE_TYPE more flexible

2013-11-22 Thread DJ Delorie

> In general you need to analyze each such case individually to produce a 
> reasoned argument for what it should logically be doing.  Given such 
> analyses, maybe then you can identify particular tables of types in 
> particular orders (for example) that should be set up to iterate through.

Ok, I'll do this next, except it's what I did first... and we still
haven't decided how to handle some of those cases.  I'll re-analyze.

> Does the target with __int20 actually have __int128 (i.e. pass 
> targetm.scalar_mode_supported_p (TImode))?  But you should indeed be able 
> to have an arbitrary number of such types.

It doesn't support it, but it does *have* it.  In that the compiler
correctly parses the __int128 keyword and knows to tell you it isn't
supported.  So, it needs at least two keywords.  Which implies "it
needs two..." in other places.

And it's reasonable to expect that *someone* will want int16, int32,
etc types once a general solution is in place.

> (d) I don't think the standard C types are particularly relevant to your 
> project.

Should we be pulling the int128 support out of the integer_types[]
list and putting it in the global_trees[] list (or some table)?  Because
most of the int128 support is tied in with the standard C type
handling, not the target-specific handling.


Question about CFLAGS/CXXFLAGS when building GCC

2013-11-22 Thread Steve Ellcey

I am building a cross GCC (targeting MIPS) on an x86-64 Linux system but I
want to build the compiler as a 32 bit executable.  I thought the right way
to do this was to do:

export CFLAGS='-O2 -g -m32'
export CXXFLAGS='-O2 -g -m32'

before running configure and make.

This is working in that it created cc1 as a 32 bit executable like I wanted
it to but when the build continues and builds libgcc, it uses CFLAGS when
it is using the newly built gcc to compile libgcc.  That is wrong because the
GCC compiler that I just built (targeting MIPS) does not understand the
-m32 flag and I don't want to override the options used when building the
libraries anyway, only the options used to build executables.

Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?

Steve Ellcey
sell...@mips.com



Re: Question about CFLAGS/CXXFLAGS when building GCC

2013-11-22 Thread H.J. Lu
On Fri, Nov 22, 2013 at 1:24 PM, Steve Ellcey  wrote:
>
> I am building a cross GCC (targeting MIPS) on an x86-64 Linux system but I
> want to build the compiler as a 32 bit executable.  I thought the right way
> to do this was to do:
>
> export CFLAGS='-O2 -g -m32'
> export CXXFLAGS='-O2 -g -m32'
>
> before running configure and make.
>
> This is working in that it created cc1 as a 32 bit executable like I wanted
> it to but when the build continues and builds libgcc, it uses CFLAGS when
> it is using the newly built gcc to compile libgcc.  That is wrong because the
> GCC compiler that I just built (targeting MIPS) does not understand the
> -m32 flag and I don't want to override the options used when building the
> libraries anyway, only the options used to build executables.
>
> Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?
>

Can you leave CFLAGS/CXXFLAGS alone? Instead, do:

# CC="gcc -m32" CXX="g++ -m32" .../configure 
# make CC="gcc -m32" CXX="g++ -m32" ...

-- 
H.J.
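
The difference between the two approaches can be illustrated with plain shell semantics. The sketch below is not taken from the thread and does not model how GCC's configure or Makefiles record these values; it only shows that an exported variable is inherited by every later child process, while a `VAR=value command` assignment is visible to that one command only. The variable names `exported_seen`, `percmd_seen` and `after_seen` are made up for the illustration.

```shell
# Start from a clean slate so the demonstration is reproducible.
unset CFLAGS CC

# Exported: every child process started afterwards inherits it.
export CFLAGS='-O2 -g -m32'
exported_seen=$(sh -c 'printf %s "$CFLAGS"')
unset CFLAGS

# Per-command assignment: placed only in the environment of this one child.
percmd_seen=$(CC='gcc -m32' sh -c 'printf %s "$CC"')

# A later child no longer sees CC at all.
after_seen=$(sh -c 'printf %s "${CC:-unset}"')

echo "exported CFLAGS seen by child: $exported_seen"
echo "per-command CC seen by child:  $percmd_seen"
echo "CC seen by a later child:      $after_seen"
```

Putting -m32 into CC and CXX ties the flag to the host compiler itself, so it does not travel along with a generic flags variable that other compilers might also pick up.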


Re: Question about CFLAGS/CXXFLAGS when building GCC

2013-11-22 Thread Eric Botcazou
> I am building a cross GCC (targeting MIPS) on an x86-64 Linux system but I
> want to build the compiler as a 32 bit executable.  I thought the right way
> to do this was to do:
> 
> export CFLAGS='-O2 -g -m32'
> export CXXFLAGS='-O2 -g -m32'
> 
> before running configure and make.
> 
> This is working in that it created cc1 as a 32 bit executable like I wanted
> it to but when the build continues and builds libgcc, it uses CFLAGS when
> it is using the newly built gcc to compile libgcc.

The usual way to do this is to set CC and CXX at the configure stage:

CC="gcc -m32" CXX="g++ -m32" $(srcdir)/configure ...

-- 
Eric Botcazou


Re: Question about CFLAGS/CXXFLAGS when building GCC

2013-11-22 Thread Steve Ellcey
On Fri, 2013-11-22 at 13:48 -0800, H.J. Lu wrote:
> On Fri, Nov 22, 2013 at 1:24 PM, Steve Ellcey wrote:
> >
> > I am building a cross GCC (targeting MIPS) on an x86-64 Linux system but I
> > want to build the compiler as a 32 bit executable.  I thought the right way
> > to do this was to do:
> >
> > export CFLAGS='-O2 -g -m32'
> > export CXXFLAGS='-O2 -g -m32'
> >
> > before running configure and make.
> >
> > This is working in that it created cc1 as a 32 bit executable like I wanted
> > it to but when the build continues and builds libgcc, it uses CFLAGS when
> > it is using the newly built gcc to compile libgcc.  That is wrong because the
> > GCC compiler that I just built (targeting MIPS) does not understand the
> > -m32 flag and I don't want to override the options used when building the
> > libraries anyway, only the options used to build executables.
> >
> > Am I setting the wrong CFLAGS/CXXFLAGS variables?  Or is this a bug?
> >
> 
> Can you leave CFLAGS/CXXFLAGS alone? Instead, do:
> 
> # CC="gcc -m32" CXX="g++ -m32" .../configure 
> # make CC="gcc -m32" CXX="g++ -m32" ...

Doh.  I don't know why that didn't occur to me.  It should work
and that is what I will do.

Steve
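
One way to confirm that a rebuilt executable such as cc1 really is 32-bit is to look at the ELF class byte. The helper below is a hypothetical sketch, not something from the thread: `elf_class` is an invented name, it assumes the file is an ELF object, and it does not check the ELF magic number. It relies only on the ELF header layout, in which byte 4 (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects.

```shell
# Hypothetical helper: print 32, 64 or "unknown" for the file given as $1.
# Reads the single byte at offset 4 (EI_CLASS) of the ELF header.
elf_class() {
    # od prints that byte as an unsigned decimal; tr strips the padding.
    class=$(od -An -t u1 -j 4 -N 1 "$1" | tr -d '[:space:]')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo unknown ;;
    esac
}
```

Usage would be along the lines of `elf_class gcc/cc1`, where the path to cc1 inside the build tree is an assumption. The `file` utility reports the same information where it is installed.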




Re: proposal to make SIZE_TYPE more flexible

2013-11-22 Thread Joseph S. Myers
On Fri, 22 Nov 2013, DJ Delorie wrote:

> > Does the target with __int20 actually have __int128 (i.e. pass 
> > targetm.scalar_mode_supported_p (TImode))?  But you should indeed be able 
> > to have an arbitrary number of such types.
> 
> It doesn't support it, but it does *have* it.  In that the compiler
> correctly parses the __int128 keyword and knows to tell you it isn't
> supported.  So, it needs at least two keywords.  Which implies "it
> needs two..." in other places.

Treating __int20 and __int128 identically implies that __int128 is *not* 
a keyword on targets that do not support it.

> And it's reasonable to expect that *someone* will want int16, int32,
> etc types once a general solution is in place.

As previously noted, it's best only to define such types where (a) there 
is an integer mode passing targetm.scalar_mode_supported_p and (b) no 
standard C type matches, to avoid issues with whether __int32 is the same 
as int or not.

> > (d) I don't think the standard C types are particularly relevant to your 
> > project.
> 
> Should we be pulling the int128 support out of the integer_types[]
> list and putting it in the global_trees[] (or some table) list?  Because
> most of the int128 support is tied in with the standard C type
> handling, not the target-specific handling.

My guess is that the int128 support doesn't belong in any of the existing 
global arrays, but in some new arrays supporting whatever set of intN 
types the target has.  That's just a guess; whether you follow or don't 
follow it, your analysis of the code needs to justify your choice.

-- 
Joseph S. Myers
jos...@codesourcery.com