Fortran regressions on Cygwin_NT

2007-08-15 Thread Paul Thomas

The failures below have all come up in the last few days using

GNU Fortran (GCC) 4.3.0 20070815 (experimental)

on Cygwin_NT/amd64

Cheers

Paul

FAIL: gfortran.dg/g77/980310-3.f (internal compiler error)
FAIL: gfortran.dg/g77/980310-3.f (test for excess errors)
Running /svn/trunk/gcc/testsuite/gfortran.dg/gomp/gomp.exp ...
Running /svn/trunk/gcc/testsuite/gfortran.dg/vect/vect.exp ...
Running /svn/trunk/gcc/testsuite/gfortran.fortran-torture/compile/compile.exp ...
Running /svn/trunk/gcc/testsuite/gfortran.fortran-torture/execute/execute.exp ...
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O0
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O1
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O2
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O3 -fomit-frame-pointer
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O3 -fomit-frame-pointer -funroll-loops
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O3 -g
FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -Os




Re: Fortran regressions on Cygwin_NT

2007-08-15 Thread François-Xavier Coudert
> FAIL: gfortran.dg/g77/980310-3.f (internal compiler error)
> FAIL: gfortran.dg/g77/980310-3.f (test for excess errors)

I saw this one on x86_64-linux with -m32, and filed it as PR33074. I
asked about it on IRC yesterday, and if I understood Andrew Pinski, it
probably is a middle-end problem, as people have been messing with
reload recently.

> FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90 execution,  -O0

This one apparently appeared between rev. 127178 and 2007-08-06 (see
http://gcc.gnu.org/ml/gcc-testresults/2007-08/msg00161.html and
http://gcc.gnu.org/ml/gcc-testresults/2007-08/msg00278.html; there is
no revision number for the second one), and it is also seen on a few
platforms. It probably was introduced by me (recent NINT patch) and
fixed as per http://gcc.gnu.org/ml/gcc-patches/2007-08/msg00902.html


FX


Re: Fortran regressions on Cygwin_NT

2007-08-15 Thread Tobias Burnus
Paul Thomas wrote:
> FAIL: gfortran.dg/g77/980310-3.f (internal compiler error)
> FAIL: gfortran.dg/g77/980310-3.f (test for excess errors)
I get the same error on x86-64/openSUSE with "-m32 -O"; with -m64, and
without "-O", it works.

FX reported it yesterday as PR 33074.

> FAIL: gfortran.fortran-torture/execute/intrinsic_integer.f90
> execution,  -O0
Works here (with both -m32 and -m64; incl. under valgrind).

I only get a failure for random_7.f90 (PR33077).

Tobias


Re: GCC 4.3.0 Status Report (2007-08-09)

2007-08-15 Thread Jan Hubicka
> Jan Hubicka wrote:
> 
> > One thing I would like to see in is the sharing checker.  The criteria
> > of bootstrap/regtesting on primary platforms is almost met now with
> > exception of regmove pass that I sent patch for some time ago.
> > http://gcc.gnu.org/ml/gcc-patches/2006-12/msg01441.html
> > I will do re-testing now and see if some new problems have appeared.
> 
> Thank you for bringing this up.  I'd like to get the checker in too.
> But, I don't really understand the regrename.c patch.  Are you saying
> that regrename.c is broken, and that we need to make these copies
> because of a real bug?  Or just to make the checker happy?  If the

Introducing wrong sharing is a real bug :) But I know of no testcase where
it leads to an ICE or produces wrong code without the checker. Regrename is
run late, sharing is introduced just for complex instruction patterns, and
not too many passes afterwards care about sharing.

The copying occurs only when nontrivial RTX expressions are matched,
which generally happens only in combiner patterns dealing with arithmetic
and the corresponding set of flags. Those are not terribly common, so it is
a sub-1% memory use growth on combine.c and PPC, 0% on i386.

However, I am no longer sure I fully understand why the sharing is needed
in the first place - regrename seems to have a later mechanism to deal with
match_dup, and it seems to me that it can only result in a mismatch when
there was invalid sharing introduced before regrename (so updating the
insn caused one copy of the matched RTX to be altered but no other
copy).

I am now re-testing an alternate patch that simply disables the code
introducing sharing, in the hope that it was just a symptomatic fix for
a sharing issue originally and that it will simply pass now.  I will know
the results tonight.

Honza
> latter, have you measured the compile-time and memory usage to see what
> impact that has?  We'd like to avoid making the compiler slower just to
> make the checker happy -- but, of course, it might be worth a small hit
> to get the checking benefit.
> 
> Thanks,
> 
> -- 
> Mark Mitchell
> CodeSourcery
> [EMAIL PROTECTED]
> (650) 331-3385 x713


bootstrapping with -fopenmp

2007-08-15 Thread Razya Ladelsky
Hi,

I'm trying to bootstrap (parloop branch) with -ftree-parallelize-loops=4,
which also requires -fopenmp.
I'm using:
make BOOTCFLAGS="-O2 -ftree-parallelize-loops=4 -fopenmp" bootstrap -j 16
I'm failing at the beginning of stage2 because the compiler can't find
libgomp.spec

How do I bootstrap correctly with -fopenmp?

Thanks,
Razya



treelang: can we replace 'unsigned char *chars' by 'char *chars'?

2007-08-15 Thread Lemaitre Laurent-r29173
Hi All,

In file treelang.h structure token_part is defined as follows:
struct token_part GTY(())
{
  location_t location;
  unsigned int charno;
  unsigned int length; /* The value.  */
  const unsigned char *chars;  <-- HERE
};

'unsigned char *chars' is used instead of just 'char *chars'.
Is there any reason (speed, memory) why 'unsigned' is used?

I am building an autogenerated version of 'treelang' and I am trying
to generate 'tree' nodes directly in file parse.y. That
is the reason why I am asking.

Thanks,

Laurent


Re: bootstrapping with -fopenmp

2007-08-15 Thread Paolo Bonzini

Razya Ladelsky wrote:

> Hi,
>
> I'm trying to bootstrap (parloop branch) with -ftree-parallelize-loops=4,
> which requires also -fopenmp.
> I'm using: make BOOTCFLAGS="-O2 -ftree-parallelize-loops=4 -fopenmp"
> bootstrap -j 16
> I'm failing at the beginning of stage2 because the compiler can't find
> libgomp.spec
>
> How do I bootstrap correctly with fopenmp?


You have to add bootstrap=true to libgomp, and regenerate Makefile.in.

Paolo


RE: gcc on SCO

2007-08-15 Thread Williams, Gerald S (Jerry)
Dave Korn wrote:
> But consider also
> http://gcc.gnu.org/svn/gcc/trunk/README.SCO

Which calls them "not a serious threat." I hadn't been closely
following this, but that sure seems to be the case given last
week's ruling.

 
http://arstechnica.com/news.ars/post/20070812-sco-never-owned-unix-copyrights-owes-novell-95-percent-of-unix-royalties.html

http://en.wikipedia.org/wiki/Sco_group#SCO-Linux_lawsuits_and_controversies

gsw


Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Andrew Pinski
On 8/15/07, Zack Weinberg <[EMAIL PROTECTED]> wrote:
> Is popcount really slow on PowerPC?  (Compared to clz?)
popcount is really popcount in bytes and then you do a multiply to get
the real popcount.  This is why it is slower than count leading zeros.
Also popcount does not exist in most PowerPCs, while count leading
zeros exists in all.
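
As a rough illustration of the two-step scheme described above, here is a minimal C sketch for a 32-bit word. The helper popcount_bytes only stands in for a per-byte population count (the step that hardware would do in one instruction where such an instruction exists); the function names and the check routine are illustrative assumptions, not GCC or rs6000 code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-byte popcount: each byte of the
   result holds the number of set bits in the matching byte of x.  */
static uint32_t popcount_bytes(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);                 /* 2-bit sums */
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u); /* 4-bit sums */
    x = (x + (x >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
    return x;
}

/* The multiply adds the four byte counts into the top byte; the
   shift then extracts that sum.  */
static uint32_t popcount32(uint32_t x)
{
    return (popcount_bytes(x) * 0x01010101u) >> 24;
}

/* Small usage example.  */
void check_popcount32(void)
{
    assert(popcount_bytes(0xF0F00001u) == 0x04040001u);
    assert(popcount32(0xF0F00001u) == 9u);
}
```

The extra multiply and shift after the per-byte step are what make the full popcount longer than a single count-leading-zeros instruction.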

Thanks,
Andrew Pinski


Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Zack Weinberg

Andrew Pinski wrote:

> On 8/15/07, Zack Weinberg <[EMAIL PROTECTED]> wrote:
> > Is popcount really slow on PowerPC?  (Compared to clz?)
> 
> popcount is really popcount in bytes and then you do a multiply to get
> the real popcount.  This is why it is slower than count leading zeros.
> Also popcount does not exist in most powerpc's while count leading
> zeros exist in all.


Makes sense.  I don't suppose I could persuade you to teach rs6000 
RTX_COSTS about clz and popcount...?


zw



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Zack Weinberg

Segher Boessenkool wrote:

> > * I would like to do the same for __builtin_ctz, but there is a catch.
> > The synthetic ctz sequence in terms of popcount (as presently
> > implemented by ia64.md, and potentially usable for at least i386 and
> > rs6000 as well if moved to optabs.c) produces the canonical behavior at
> > zero, but the synthetic sequence in terms of clz (as presently
> > implemented by optabs.c) produces the value -1 at zero.
> 
> I suppose you're using (assuming 32-bit)
> 
> ctz(x) := 31 - clz(x & -x)
> 
> now, which gives -1 for 0; and the version you're looking for is
> 
> ctz(x) := 32 - clz(~x & (x-1))
> 
> which gives 32 for 0.


Thanks!  That's, unfortunately, one more instruction, although I guess a 
lot of chips have "a & ~b" as one operation.



> What does the popcount version look like?  Never seen that before,
> but I think it will be really expensive on PowerPC.


  ctz(x) := popcount(~x & (x-1))

Just the same thing as your version of the ctz-as-clz operation, but 
without the final adjustment.  It looks like ~x & (x-1) turns any number 
into 000...111... where the boundary between zeroes and ones lies at the 
lowest 1 in the original.
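
The three ctz expansions discussed in this message can be compared side by side in a small C sketch. It assumes 32-bit operands and uses portable loop-based clz32 and popcount32 helpers with the "canonical" value 32 for clz at zero; those helpers and all function names are illustrative, not what optabs.c or ia64.md actually emit.

```c
#include <assert.h>
#include <stdint.h>

/* Portable clz that returns 32 for a zero input.  */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Portable popcount: clears the lowest set bit on each iteration.  */
static int popcount32(uint32_t x)
{
    int n = 0;
    for (; x != 0; x &= x - 1u)
        n++;
    return n;
}

/* ctz(x) := 31 - clz(x & -x); gives -1 for x == 0.  */
static int ctz_via_lowest_bit(uint32_t x)
{
    return 31 - clz32(x & (0u - x));
}

/* ctz(x) := 32 - clz(~x & (x-1)); gives 32 for x == 0.  */
static int ctz_via_clz_mask(uint32_t x)
{
    return 32 - clz32(~x & (x - 1u));
}

/* ctz(x) := popcount(~x & (x-1)); also gives 32 for x == 0.  */
static int ctz_via_popcount(uint32_t x)
{
    return popcount32(~x & (x - 1u));
}

/* Small usage example: the three agree except at zero.  */
void check_ctz_variants(void)
{
    assert(ctz_via_lowest_bit(8u) == 3);
    assert(ctz_via_clz_mask(8u) == 3);
    assert(ctz_via_popcount(8u) == 3);
    assert(ctz_via_lowest_bit(0u) == -1);
    assert(ctz_via_clz_mask(0u) == 32);
    assert(ctz_via_popcount(0u) == 32);
}
```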


Is popcount really slow on PowerPC?  (Compared to clz?)  Ideally one 
would choose between the two expansions based on RTL costs, but the only 
 architectures it matters for are i386 and powerpc, and neither of them 
define the cost of either clz or popcount.


zw


Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Zack Weinberg

Joern Rennecke wrote:

> > The score, sh and sparc instructions may or may not display canonical
> > behavior; their ports do not define CLZ_DEFINED_VALUE_AT_ZERO and I was
> > not able to find documentation of the relevant instruction.
> 
> The operation the nsb instruction of the SHmedia instruction set performs
> is 'count number of sign bit copies'.
> [...]


It sounds like the SH should probably be lumped in with the x86 as not
doing "canonical behavior".  Conveniently enough for my grand plan, it
already uses an UNSPEC for the actual instruction :-)

What is the result of the instruction for (64-bit) all-bits-zero or
all-bits-one?  64?  Assuming so, it occurs to me that the result of an
unsigned clz() on any negative 64-bit value will be zero; thus, you
could get a "canonical" clz out of nsb by doing (pseudo-assembly)

mov result, 0
cmp/pz  arg
bf  1f
nsb result, arg
1:

Similarly, the x & (x-1) operation used to set up for ctz/ffs in terms
of clz will leave the high bit set *only* for x == 0x8000  
; which can be tested for as x == (x&(x-1)) and the nsb skipped.

Would these sequences be slower than the current logic?


> The ARC700 has a NORM instruction, which again counts the number of
> sign bit copies.  There is a variant NORM.F which sets the N flag if the
> input is negative.


Sorry, I don't recognize the ARC700 - which GCC back end is that?  It
might be worth teaching optabs.c about sign-bit-count operations, but
only if we have more than one architecture that can use it.

zw



Re: Announce: VCG support for Graph::Easy

2007-08-15 Thread Tels
Moin,

On Sunday 12 August 2007 20:11:34 Tels wrote:
> Moin,

The signature on my email was bad/broken, when it came back to me from the 
mailing-list. Did this happen to anybody else?

Since this never happened to me before, here is another email, as test. 
Let's see if the signature is still bad (e.g. the list garbles my text 
somehow).

Sorry for the noise,

Tels

-- 
 Signed on Sun Aug 12 23:44:32 2007 with key 0x93B84C15.
 View my photo gallery: http://bloodgate.com/photos
 PGP key on http://bloodgate.com/tels.asc or per email.

 "Duke Nukem Forever is a 1999 game and we think that timeframe matches
 very well with what we have planned for the game."

  -- George Broussard, 1998 (http://tinyurl.com/6m8nh)




Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread David Edelsohn
I think the cost would be something like:

Index: rs6000.c
===================================================================
--- rs6000.c	(revision 127484)
+++ rs6000.c	(working copy)
@@ -20292,10 +20292,15 @@
*total += COSTS_N_INSNS (2);
   return false;
 
+case CTZ:
 case FFS:
   *total = COSTS_N_INSNS (4);
   return false;
 
+case POPCOUNT:
+  *total = COSTS_N_INSNS (3);
+  return false;
+
 case NOT:
   if (outer_code == AND || outer_code == IOR || outer_code == XOR)
{
@@ -20305,6 +20310,7 @@
   /* FALLTHRU */
 
 case AND:
+case CLZ:
 case IOR:
 case XOR:
 case ZERO_EXTRACT:


Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Segher Boessenkool

> > I suppose you're using (assuming 32-bit)
> > ctz(x) := 31 - clz(x & -x)
> > now, which gives -1 for 0; and the version you're looking for is
> > ctz(x) := 32 - clz(~x & (x-1))
> > which gives 32 for 0.
> 
> Thanks!  That's, unfortunately, one more instruction, although I guess
> a lot of chips have "a & ~b" as one operation.


Yes, it's exactly the same cost on PowerPC, and on most other
RISC architectures.

> It looks like ~x & (x-1) turns any number into 000...111... where the
> boundary between zeroes and ones lies at the lowest 1 in the original.


Exactly.  "To the right of the lowest 1".

> Is popcount really slow on PowerPC?  (Compared to clz?)  Ideally one
> would choose between the two expansions based on RTL costs, but the
> only architectures it matters for are i386 and powerpc, and neither
> of them define the cost of either clz or popcount.


Andrew answered this already.  Adding clz/popcount to the cost
tables seems like a good idea, yes.


Segher



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread David Edelsohn
> Zack Weinberg writes:

Zack> Makes sense.  I don't suppose I could persuade you to teach rs6000 
Zack> RTX_COSTS about clz and popcount...?

Sure.  It's not that difficult to add to the table.

David



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Jan Hubicka
> 
> Is popcount really slow on PowerPC?  (Compared to clz?)  Ideally one 
> would choose between the two expansions based on RTL costs, but the only 
>  architectures it matters for are i386 and powerpc, and neither of them 
> define the cost of either clz or popcount.

Of course adding a popcount/clz cost into i386 cost tables is easy and
probably most correct thing to do :)

Honza
> 
> zw


Re: Announce: VCG support for Graph::Easy

2007-08-15 Thread Jan-Benedict Glaw
On Sun, 2007-08-12 23:45:09 +0200, Tels <[EMAIL PROTECTED]> wrote:
> 
> The signature on my email was bad/broken, when it came back to me from the 
> mailing-list. Did this happen to anybody else?
> 
> Since this never happened to me before, here is another email, as test. 
> Let's see if the signature is still bad (e.g. the list garbles my text 
> somehow).

This unfortunately happens regularly on this list. I don't know whether
this is considered a problem or merely an annoyance.

What actually *is* a problem is that you're most probably using a
non-working "From: " header.  Or do you actually read nospam-abuse?

MfG, JBG

-- 
  Jan-Benedict Glaw  [EMAIL PROTECTED]  +49-172-7608481
Signature of:   ...und wenn Du denkst, es geht nicht mehr,
the second  :  kommt irgendwo ein Lichtlein her.




Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Segher Boessenkool

> I think the cost would be something like:
> 
> +case POPCOUNT:
> +  *total = COSTS_N_INSNS (3);
> +  return false;


Is that the cost when using popcountb?  It is a lot more
expensive when that instruction isn't available (like on
most current machines).

The rest (i.e. CLZ, CTZ) looks good to me.


Segher



Re: Announce: VCG support for Graph::Easy

2007-08-15 Thread Tels
Moin,

On Wednesday 15 August 2007 21:30:16 Jan-Benedict Glaw wrote:
> On Sun, 2007-08-12 23:45:09 +0200, Tels <[EMAIL PROTECTED]> wrote:
> > The signature on my email was bad/broken, when it came back to me from
> > the mailing-list. Did this happen to anybody else?
> >
> > Since this never happened to me before, here is another email, as test.
> > Let's see if the signature is still bad (e.g. the list garbles my text
> > somehow).
>
> This unfortunately happens regularly on this list. I don't know whether
> this is considered a problem or merely an annoyance.

It happened to my second mail, too, and lazy inspection shows that probably 
just some lines are wrapped. Still, very annoying as it breaks my 
signatures and I consider it a problem.

(Of course, apart from this announcement, I don't intend to post much on
this list, except maybe if someone asks me a VCG-related question.)

> What actually *is* a problem is that you're most probably using a
> non-working "From: " header.  Or do you actually read nospam-abuse?

Of course. Why should I not read it?

All the best,

Tels


-- 
 Signed on Wed Aug 15 22:02:13 2007 with key 0x93B84C15.
 View my photo gallery: http://bloodgate.com/photos
 PGP key on http://bloodgate.com/tels.asc or per email.

 "We have problems like this all of the time," Kirk said, trying to
 reassure me.  "Sometimes its really hard to get things burning."

  -- http://tinyurl.com/qmg5




Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread David Edelsohn
> Segher Boessenkool writes:

>> I think the cost would be something like:
>> +case POPCOUNT:
>> +  *total = COSTS_N_INSNS (3);
>> +  return false;

Segher> Is that the cost when using popcountb?  It is a lot more
Segher> expensive when that instruction isn't available (like on
Segher> most current machines).

Yes, but do we even create POPCOUNT rtx if the insn isn't
supported?  Wouldn't we expand or create libcall early?

David



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Segher Boessenkool

> >> I think the cost would be something like:
> >> +case POPCOUNT:
> >> +  *total = COSTS_N_INSNS (3);
> >> +  return false;
> 
> Segher> Is that the cost when using popcountb?  It is a lot more
> Segher> expensive when that instruction isn't available (like on
> Segher> most current machines).
> 
> Yes, but do we even create POPCOUNT rtx if the insn isn't
> supported?  Wouldn't we expand or create libcall early?


I don't know, there's only one way to find out... :-)


Segher



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread David Edelsohn
> Segher Boessenkool writes:

>> Yes, but do we even create POPCOUNT rtx if the insn isn't
>> supported?  Wouldn't we expand or create libcall early?

Segher> I don't know, there's only one way to find out... :-)

I did check.  Didn't you?

David



Re: RFC: Simplify rules for ctz/clz patterns and RTL

2007-08-15 Thread Joern Rennecke
On Wed, Aug 15, 2007 at 11:55:02AM -0700, Zack Weinberg wrote:
> Joern Rennecke wrote:
> >The operation the nsb instruction of the SHmedia instruction set performs
> >is 'count number of sign bit copies'.
> >[...]
> 
> It sounds like the SH should probably be lumped in with the x86 as not 
> doing "canonical behavior".  Conveniently enough for my grand plan, it 
> already uses an UNSPEC for the actual instruction :-)
> 
> What is the result of the instruction for (64-bit) all-bits-zero or 
> all-bits-one?  64?

No, it is 63.  There is one essential sign bit and 63 more copies.

> Assuming so, it occurs to me that the result of an 
> unsigned clz() on any negative 64-bit value will be zero; thus, you 
> could get a "canonical" clz out of nsb by doing (pseudo-assembly)
> 
>   mov result, 0
>   cmp/pz  arg
>   bf  1f
>   nsb result, arg
> 1:

We are talking about SHmedia code here.  cmp/pz and bf are not SHmedia
instructions.  Loading a zero into result would be movi 0,result .

If you want to special-case the negative input, that would be:

 shari  arg,31,tmp
 nsb    arg,result
 cmvne  tmp,tmp,result
 addi   result,1,result
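
The idea behind that sequence can be modelled in C as follows. This is only a sketch under the semantics described in this message (nsb counts the copies of the sign bit, giving 63 for all-bits-zero and all-bits-one on a 64-bit value); nsb64 and clz64_from_nsb are illustrative names, and the code models the arithmetic result for a 64-bit operand rather than the exact instruction operands.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit 'count sign bit copies' operation: the number of
   bits below bit 63 that are equal to bit 63, counted from the top.  */
static int nsb64(int64_t x)
{
    /* ~x has the same number of sign-bit copies as x, so fold negative
       inputs onto non-negative ones; bit 63 of u is then always 0.  */
    uint64_t u = (x < 0) ? ~(uint64_t) x : (uint64_t) x;
    int n = 0;
    for (int bit = 62; bit >= 0 && !((u >> bit) & 1u); bit--)
        n++;
    return n;
}

/* "Canonical" clz built on nsb: 0 when the top bit is set, otherwise
   nsb + 1 (which yields 64 for a zero input).  */
static int clz64_from_nsb(uint64_t x)
{
    if (x >> 63)
        return 0;
    return nsb64((int64_t) x) + 1;
}

/* Small usage example.  */
void check_clz64_from_nsb(void)
{
    assert(nsb64(0) == 63);
    assert(nsb64(-1) == 63);
    assert(clz64_from_nsb(0u) == 64);
    assert(clz64_from_nsb(1u) == 63);
    assert(clz64_from_nsb(UINT64_C(0x8000000000000000)) == 0);
}
```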

> Similarly, the x & (x-1) operation

It's x ^ (x-1) (xor) or x &~(x-1) (andc)

> used to set up for ctz/ffs in terms 
> of clz will leave the high bit set *only* for x == 0x8000   
> ; which can be tested for as x == (x&(x-1)) and the nsb skipped.
> 
> Would these sequences be slower than the current logic?

currently we have for ffs:

 addi   arg,-1,tmp
 xor    arg,tmp,tmp
 shlri  tmp,1,tmp
 nsb    tmp,tmp
 addi   tmp,-64,tmp
 cmveq  arg,r63,tmp
 sub    r63,tmp,result
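
Read step by step, this ffs sequence corresponds to the following C model. It is a sketch only: nsb64 emulates the sign-bit-copy count as described earlier in this message (63 for an all-zero input), r63 is taken to read as zero, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a 64-bit 'count sign bit copies' operation.  */
static int nsb64(int64_t x)
{
    uint64_t u = (x < 0) ? ~(uint64_t) x : (uint64_t) x;
    int n = 0;
    for (int bit = 62; bit >= 0 && !((u >> bit) & 1u); bit--)
        n++;
    return n;
}

/* ffs modelled on the instruction sequence, one line per instruction.  */
static int ffs64_via_nsb(uint64_t arg)
{
    uint64_t tmp = arg - 1u;            /* addi  arg,-1,tmp          */
    tmp = arg ^ tmp;                    /* xor: ones up to lowest 1  */
    tmp >>= 1;                          /* shlri: logical shift      */
    int t = nsb64((int64_t) tmp);       /* nsb                       */
    t -= 64;                            /* addi  tmp,-64,tmp         */
    if (arg == 0)                       /* cmveq arg,r63,tmp         */
        t = 0;
    return 0 - t;                       /* sub   r63,tmp,result      */
}

/* Small usage example: ffs is 1-based and returns 0 for zero.  */
void check_ffs64_via_nsb(void)
{
    assert(ffs64_via_nsb(0u) == 0);
    assert(ffs64_via_nsb(1u) == 1);
    assert(ffs64_via_nsb(12u) == 3);
    assert(ffs64_via_nsb(UINT64_C(0x8000000000000000)) == 64);
}
```

The logical shift right by one clears the top bit before nsb, which is why the input 0x8000000000000000 needs no special case here and only arg == 0 is patched up by the conditional move.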

Using the above sequence, we get the more register-hungry:

 addi   arg,-1,tmp
 xor    arg,tmp,tmp
 shari  tmp,31,tmp2
 nsb    tmp,tmp
 cmvne  tmp2,tmp2,tmp
 addi   tmp,1-64,tmp
 sub    r63,tmp,result

you propose:

 pt     after_nsb,trtmp
 addi   arg,-1,tmp
 andc   arg,tmp,tmp
 movi   -1,tmp2
 beq    arg,tmp,trtmp
 nsb    tmp,tmp2
after_nsb:
 addi   tmp2,-63,tmp
 sub    r63,tmp,result

or is that:

 pt     after_nsb,trtmp
 addi   arg,-1,tmp
 xor    arg,tmp,tmp
 bgt    r63,tmp,trtmp
 nsb    tmp,tmp
after_nsb:
 addi   tmp,-63,tmp
 sub    r63,tmp,result
 
At any rate, the introduction of the branch makes the code worse.

But for the ARC, it would make an interesting shortcut.  Although norm can't
be conditionalized, we can use the -1 from the xor to save on a long
immediate for 32 bit ffs.

 sub_s  tmp,arg,1
 xor.f  tmp,tmp,arg ; for -Os this can be xor_s and
 norm   result,tmp  ; then norm.f produces the flag.
 mov.mi result,tmp
 rsub   result,31,result

> >The ARC700 has a NORM instruction, which again counts the number of
> >sign bit copies.  There is a variant NORM.F which sets the N flag if the
> >input is negative.
> 
> Sorry, I don't recognize the ARC700 - which GCC back end is that?

It belongs in config/arc ; however, proper ARC700 support is not in the
FSF mainline yet.
We are working on it.

> It 
> might be worth teaching optabs.c about sign-bit-count operations, but 
> only if we have more than one architecture that can use it.

The NORM instruction is also available as an optional extension operation
for ARCtangent-A5 and ARC600.


gcc-4.2-20070815 is now available

2007-08-15 Thread gccadmin
Snapshot gcc-4.2-20070815 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.2-20070815/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.2 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_2-branch 
revision 127526

You'll find:

gcc-4.2-20070815.tar.bz2            Complete GCC (includes all of below)

gcc-core-4.2-20070815.tar.bz2       C front end and core compiler

gcc-ada-4.2-20070815.tar.bz2        Ada front end and runtime

gcc-fortran-4.2-20070815.tar.bz2    Fortran front end and runtime

gcc-g++-4.2-20070815.tar.bz2        C++ front end and runtime

gcc-java-4.2-20070815.tar.bz2       Java front end and runtime

gcc-objc-4.2-20070815.tar.bz2       Objective-C front end and runtime

gcc-testsuite-4.2-20070815.tar.bz2  The GCC testsuite

Diffs from 4.2-20070627 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.2
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.