Re: GCC 4.0 RC2 Available

2005-04-19 Thread Richard Sandiford
Mark Mitchell <[EMAIL PROTECTED]> writes:
> RC2 is available here:
>
>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
>
> As before, I'd very much appreciate it if people would test these bits
> on primary and secondary platforms, post test results with the
> contrib/test_summary script, and send me a message saying whether or
> not there are any regressions, together with a pointer to the results.

Results for mips-elf are here:

   http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01331.html

and look good.  No regressions.

Richard


Re: MIPS, libsupc++ and -G 0

2005-04-19 Thread Richard Sandiford
Jonathan Larmour <[EMAIL PROTECTED]> writes:
> On MIPS, libgcc is built with -G 0, which is used to ensure the contents
> don't assume they will be placed in the small data/bss section. Setting
> -G 0 is used to allow for the possibility of large applications, or
> those where even small data may be located more than 64k away from the
> gp pointer.
>
> However this is not done with libsupc++ or libstdc++. The result is that
> for some of my embedded applications, which require -G 0 themselves,
> "stderr" is far away from the gp pointer. This shouldn't matter except
> that vterminate.cc in libsupc++ was not compiled with -G 0 and thus is
> expecting to be able to use a 16-bit gp relative relocation, thus we get
> a link failure.
>
> Was this a conscious decision or an accident? Is the best route for me
> to just add -G 0 for all mips libstdc++/libsupc++, and submit that as a
> patch?

It sounds like you think -G0 should be a safe setting for everyone,
but it isn't really.  You can't necessarily compile libraries with
-G0 and then link them against code compiled with -G8.  libgcc is
an exception rather than the norm here.

libstdc++/libsupc++ have (or at least potentially have) various global
variables that fit in the default -G8 limit.  So suppose:

foo.cc in libstdc++ defines a variable "int x"
foo.h in libstdc++ declares a variable "extern int x"
inline function bar() in foo.h refers to "x"
user code calls bar()

The user code will inline bar() and refer directly to "x".  If the
user code is compiled with -G8, it will expect "x" to be in the small
data section.  It won't link properly if foo.cc was compiled with -G0.
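
The scenario above can be sketched in a single C file, as a rough illustration only. The names "x" and "bar" come from the description; the value 42, the function user() and the file split shown in comments are assumptions made for the sketch. A real failure would only show up when the pieces are built as separate objects with different -G settings for a MIPS target.

```c
/* Sketch only: one file standing in for three separate pieces.
   The comments mark which piece each part would live in. */

/* --- foo.h (installed header, seen by user code) --- */
extern int x;                 /* small enough for the default -G8 limit */

static inline int bar(void)
{
    /* When inlined into user code built with -G8, this load is assumed
       to be gp-relative, i.e. "x" is expected to be in small data. */
    return x;
}

/* --- foo.cc (library object) --- */
/* If this definition were compiled with -G0, "x" would be placed in
   ordinary data rather than small data, and the gp-relative reference
   from the inlined bar() could fail to link. */
int x = 42;                   /* illustrative value */

/* --- user code (hypothetical caller) --- */
int user(void)
{
    return bar();             /* bar() is inlined; refers directly to x */
}
```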

The only reliable way to get what you want is to either (a) add -G0
multilibs or (b) change the default -G setting.  Perhaps a configure
option would be useful here.  Maybe something like --with-sdata-limit,
to go alongside options like --with-arch and --with-tune?

Richard


Re: GCC 4.0 RC2 Available

2005-04-19 Thread James E Wilson
Mark Mitchell wrote:
> The changes that I anticipate between now and the final release are
> (a) documentation changes, (b) a patch for 20991, and (c) a possible
> patch for 20973.  Other than that, I will only consider patches that
> fix egregious problems, like a fail to bootstrap on a primary
> platform.

I put comments in PR 20973.  It seems reasonable for a reload patch, 
i.e. only moderately dangerously unsafe as opposed to severely 
dangerously unsafe.

However, there is a bigger issue for Itanium.  The example has an 
uninitialized register read.  On Itanium, this can result in a later NaT 
consumption fault if that uninitialized register happens to have the NaT 
bit set.  That will result in a core dump.  This problem will likely be 
extremely rare, as gcc doesn't have speculation support yet, and hence 
can't create NaT bits, but there are a few hand written glibc routines 
that use them.  I think this is a serious problem for Itanium.

This seems to be an unfortunate side effect of tree-ssa code to 
decompose structure references.  The old rtl code (e.g. 
store_constructor) was careful to always initialize a structure to zero 
if it was in a register, so that we did not have any uninitialized 
register reads.  The new tree-ssa code makes no attempt to do this.  If 
this problem is fixed, then we should not need any reload patch.

This can perhaps be fixed in flow by adding code to initialize registers 
that are used before they are set.  We already have some code for this 
in initialize_uninitialized_subregs, but it doesn't handle the general 
case.  Of course, trying to fix this at such a late stage might be just 
as risky as the reload patch.  Tis a shame I didn't notice this earlier.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: MIPS, libsupc++ and -G 0

2005-04-19 Thread Zack Weinberg
Richard Sandiford <[EMAIL PROTECTED]> writes:

> The only reliable way to get what you want is to either (a) add -G0
> multilibs or (b) change the default -G setting.  Perhaps a configure
> option would be useful here.  Maybe something like --with-sdata-limit,
> to go alongside options like --with-arch and --with-tune?

Or perhaps an -m option to put stuff in .sdata as normal, but generate
code as if nothing is in there?

zw


Re: MIPS, libsupc++ and -G 0

2005-04-19 Thread Richard Sandiford
Zack Weinberg <[EMAIL PROTECTED]> writes:
> Richard Sandiford <[EMAIL PROTECTED]> writes:
>> The only reliable way to get what you want is to either (a) add -G0
>> multilibs or (b) change the default -G setting.  Perhaps a configure
>> option would be useful here.  Maybe something like --with-sdata-limit,
>> to go alongside options like --with-arch and --with-tune?
>
> Or perhaps an -m option to put stuff in .sdata as normal, but generate
> code as if nothing is in there?

Maybe (and I realise other ports do) but in some ways it gives the worst
of both worlds.  libsupc++ and libstdc++ will end up eating chunks of the
small data area without getting any real benefit from it.

A configure-time option is likely to be more convenient for folks who
use -G0 because you don't have to coerce every build system to add it
on the command line.  And it wouldn't penalise those who want to use
the usual -G8.

Richard


different address spaces (was Re: internal compiler error at dwarf2out.c:8362)

2005-04-19 Thread Martin Koegler
James E Wilson wrote:
>Björn Haase wrote:
>>In case that one should not use machine specific attributes, *is*
>>there a standard way for GCC how to implement different address spaces?
>
>Use section attributes to force functions/variables into different sections,
>and then use linker scripts to place different sections into different
>address spaces. You can define machine dependent attributes as short-hand
>for a section attribute, and presumably the eeprom attribute is an example
>of that.
>
>The only thing wrong with the eeprom attribute is that it is trying to
>create its own types. It is not necessary to create new types in order to
>get variables placed into special sections. There is nothing wrong with the
>concept of having an eeprom attribute.

Placing variables in a specific section is only part of the problem.
With, e.g., the AVR progmem attribute, the data is only placed in Flash memory.
To access it, you need to pass the address to a function that performs the
read or write operation (the last version I used was 3.x based, so this may
have changed).

What I want to accomplish with the eeprom attribute is transparent access to
such memory regions: for variables whose type has this attribute, or pointers
that point to a type with this attribute, different RTL is generated, which
calls these functions implicitly.

This approach is not perfect (for example, a type cast can assign a pointer
without this attribute to a pointer with this attribute).

For placing variables, I use the section attribute, so I can support multiple
eeprom regions. This is a specific requirement for my project, as some eeprom
data must be put in the region 0x100-0x1ff, whereas other data can be put at
any eeprom location.

For the AVR port, two such attributes would be needed (eeprom and progmem).

My prototype for the m68hc05 currently does the following (based on GCC 4.1):
* A predicate was created that checks, for a MEM expression, whether the type,
  either in the memory attributes or in the register attributes for
  (mem:XX (reg:XX ..)), contains the eeprom attribute.
* The move expander checks whether an operand matches the eeprom predicate.
  If it does, different RTL is generated (in my case, a call to the library
  function).
* All other insns reject such an operand, so I did not have to change them.
* In set_mem_attributes_minus_bitpos, I always store the expression.
  This has caused no regressions for my port. As a register/memory attribute
  may now contain an expression that is not a DECL, I also had to comment out
  one assertion and add that case to the debug print function:

Index: emit-rtl.c
===
RCS file: /cvs/gcc/gcc/gcc/emit-rtl.c,v
retrieving revision 1.436
diff -u -r1.436 emit-rtl.c
--- emit-rtl.c  25 Mar 2005 02:23:57 -  1.436
+++ emit-rtl.c  13 Apr 2005 10:41:34 -
@@ -1420,7 +1420,7 @@

   /* ARRAY_REFs, ARRAY_RANGE_REFs and BIT_FIELD_REFs should already
  have been resolved here.  */
-  gcc_assert (DECL_P (expr1));
+  /* gcc_assert (DECL_P (expr1)); */

   /* Decls with different pointers can't be equal.  */
   return 0;
@@ -1449,6 +1449,9 @@
   if (t == NULL_TREE)
 return;

+  if (expr == NULL_TREE)
+expr = t;
+
   type = TYPE_P (t) ? t : TREE_TYPE (t);
   if (type == error_mark_node)
 return;
Index: print-rtl.c
===
RCS file: /cvs/gcc/gcc/gcc/print-rtl.c,v
retrieving revision 1.121
diff -u -r1.121 print-rtl.c
--- print-rtl.c 3 Apr 2005 10:27:44 -   1.121
+++ print-rtl.c 13 Apr 2005 10:41:34 -
@@ -116,6 +116,10 @@
 }
   else if (TREE_CODE (expr) == RESULT_DECL)
 fputs (" ", outfile);
+  else if (!DECL_P(expr))
+{
+  print_node_brief(outfile,"",expr,0);
+}
   else
 {
   fputc (' ', outfile);

For my test cases, this is working.

mfg Martin Kögler
PS: Please CC me on replies


Interprocedural Dataflow Analysis - Scalability issues

2005-04-19 Thread Virender Kashyap
Hi,
   I am working on interprocedural data flow analysis (IPDFA) and need some
feedback on scalability issues in IPDFA. Firstly, since one file is
compiled at a time, we can do IPDFA only within a file. But that would
prevent us from analysing functions which are called in file
A but are defined in some other file B. So even if we do any analysis, it
would give limited advantage. Moreover, even if we are able to store
information about a large number of functions, it would cost heavily in
memory, and would therefore not scale.
So, to what extent can IPDFA be advantageous? Or are there
solutions to the above problems?

Regards,
Virender.



Re: MIPS, libsupc++ and -G 0

2005-04-19 Thread Jonathan Larmour
Richard Sandiford wrote:
> Zack Weinberg <[EMAIL PROTECTED]> writes:
>> Richard Sandiford <[EMAIL PROTECTED]> writes:
>>> The only reliable way to get what you want is to either (a) add -G0
>>> multilibs or (b) change the default -G setting.  Perhaps a configure
>>> option would be useful here.  Maybe something like --with-sdata-limit,
>>> to go alongside options like --with-arch and --with-tune?
>>
>> Or perhaps an -m option to put stuff in .sdata as normal, but generate
>> code as if nothing is in there?
>
> Maybe (and I realise other ports do) but in some ways it gives the worst
> of both worlds.  libsupc++ and libstdc++ will end up eating chunks of the
> small data area without getting any real benefit from it.
>
> A configure-time option is likely to be more convenient for folks who
> use -G0 because you don't have to coerce every build system to add it
> on the command line.  And it wouldn't penalise those who want to use
> the usual -G8.

Unfortunately the decision to use or not use -G 0 isn't taken on a 
per-compiler basis usually, but a per-application one. A -G 0 multilib 
doesn't seem good either because someone may have an application that 
doesn't link with -G 8, but might with -G 4, and at least get some benefit 
from small data. Or arguably -G 2.

I think for what is in effect a system library, Zack's proposal seems least 
bad, even though it has drawbacks. And potentially no drawbacks if someone 
ever implements linker relaxation for MIPS.

Jifl


Re: The subreg question

2005-04-19 Thread Ling-hua Tseng
James E Wilson wrote:
> Ling-hua Tseng wrote:
>> It's obvious that `movil' and `movim' only access one 16-bit
>> part of the 32-bit register. How can I use RTL expressions to
>> represent the operations?
>
> As you noticed, within a register, subreg can only be used for low
> parts.  You can't ask for the high part of a single register.  If you
> have an item that spans multiple registers, e.g. a 64-bit value that is
> contained in a register pair, then you can ask for the SImode highpart
> of a DImode reg and get valid RTL.  This works because the high part is
> an entire register.  This isn't useful to you.
>
> Otherwise, you can access subparts via bitfield insert/extract
> operations, or logical operations (ior/and), though this is likely to
> be tedious, and may confuse optimizers.
>
> There are high/lo_sum RTL operators that may be useful to you.  You can use
>  (set (reg:SI) (high: ...))
>  (set (reg:SI) (lo_sum (reg:SI) (...)))
> where the first pattern corresponds to movims, and the second one to
> movil.  You could just as well use ior instead of lo_sum for the second
> pattern, this is probably better as movil does not do an add.
>
> You may want to emit normal rtl for an SImode move, and then split it
> into its two 16-bit parts after reload.  This will avoid confusing RTL
> optimizers before reload.
>
> We have vector modes which might be useful to you.  If you say a
> register is holding a V4QI mode value, then there are natural ways to
> get at the individual elements of the vector via vector operations.

I read the descriptions of (high:m exp) and (lo_sum:m x y) in the GCC
internals manual (Sections 10.7 and 10.9).
The last line of each description confused me because it says "m should be
Pmode".
Is it really a strict rule?
The RTX "(set (reg:SI xx) (high:m yy))" seems to allow m to be an integer mode.
Doesn't it lead to some undefined behaviors in the back-end passes?
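
As a rough illustration of the arithmetic behind the high/ior split discussed above (not the RTL itself), a 32-bit value can be built from two 16-bit halves. The function names are hypothetical, and the assumption that one instruction sets the upper 16 bits while the other fills in the lower 16 is taken from the description in this thread.

```c
#include <stdint.h>

/* Upper 16 bits of v, left in place (what a "high"-style move would set). */
uint32_t high_half(uint32_t v)
{
    return v & UINT32_C(0xffff0000);
}

/* Lower 16 bits of v. */
uint32_t low_half(uint32_t v)
{
    return v & UINT32_C(0x0000ffff);
}

/* Recombine with inclusive-or; no carry is involved, unlike an add. */
uint32_t combine_ior(uint32_t hi, uint32_t lo)
{
    return hi | lo;
}
```

Because the two halves never overlap, the inclusive-or gives back the original value without any adjustment of the high part.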


libgcc_s.so 3.4 vs 3.0 compatibility

2005-04-19 Thread Peter FELECAN
Currently the libgcc_s.so library has the same version in 3.4 and 4.0,
i.e libgcc_s.so.1 (SONAME = libgcc_s.so.1).

Is this as expected?

Are the 2 libraries compatible? Interchangeable? Looking in the map
file, I don't think so.

If not, how can I make the most correct separation for a 3.4 and 4.0
cohabitation? --- the 2 versions are installed in separate
directories, however, the run-path, used by the dynamic linker,
containing libgcc_s.so is shared.

If someone can give me a pointer toward the relevant documentation I
would be grateful.
-- 
Peter FELECAN



target_shift_truncation_mask for all shifts?!

2005-04-19 Thread Andreas Krebbel
Hi Richard,

I've recently experimented with TARGET_SHIFT_TRUNCATION_MASK
macro and have posted a patch defining it for S/390. 
On S/390 only the least significant six bits of a shift count 
operand are used and I therefore expected the modulo operation 
in the following example to be optimized away:

long
f (long t, unsigned int shift_count)
{
  return t << (shift_count % 64);
}

But I understand that this optimization can't simply be added
to simplify-rtx.c.
The macro only applies to rtxes coming from named patterns
and not to shifts emitted anywhere else in the optimization process, because
combine and co. are currently not aware that the macro exists. Because
nobody is able to determine the origin of a shift rtx, performing that
kind of optimization in simplify-rtx for all incoming shifts would
lead to wrong code.
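
For an unsigned shift count, the modulo in the example above is the same as masking the count to its low six bits, which is what the hardware is described as doing. A minimal sketch, with hypothetical function names and uint64_t used instead of long so that the width is fixed:

```c
#include <stdint.h>

/* The source-level form from the example: explicit modulo on the count. */
uint64_t shift_with_modulo(uint64_t t, unsigned int shift_count)
{
    return t << (shift_count % 64);
}

/* The equivalent masked form: only the low six bits of the count are used. */
uint64_t shift_with_mask(uint64_t t, unsigned int shift_count)
{
    return t << (shift_count & 63);
}
```

If the target already truncates the count this way, the explicit modulo is redundant, which is the optimization being asked about.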

In:
http://gcc.gnu.org/ml/gcc-patches/2004-09/msg00456.html

you proposed to take care of this in the 4.1 (formerly 3.6) timeframe fixing
all places where shift rtxes are generated besides optabs.
Is this still on your todo list?

Bye,

-Andreas-


Re: target_shift_truncation_mask for all shifts?!

2005-04-19 Thread Richard Sandiford
Andreas Krebbel <[EMAIL PROTECTED]> writes:
> In:
> http://gcc.gnu.org/ml/gcc-patches/2004-09/msg00456.html
>
> you proposed to take care of this in the 4.1 (formerly 3.6) timeframe fixing
> all places where shift rtxes are generated besides optabs.
> Is this still on your todo list?

Yes, but so are a lot of things ;)  I can't guarantee that I'll get
to it in the 4.1 timeframe.  At the very least, I'll need to finish
up the .opt stuff first.

If you want to have a crack at it, please do!

Richard


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Andrew Haley
Geoffrey Keating writes:
 > Mark Mitchell <[EMAIL PROTECTED]> writes:
 > 
 > > RC2 is available here:
 > > 
 > >   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
 > > 
 > > As before, I'd very much appreciate it if people would test these bits
 > > on primary and secondary platforms, post test results with the
 > > contrib/test_summary script, and send me a message saying whether or
 > > not there are any regressions, together with a pointer to the results.
 > 
 > Bad news, I'm afraid.

It's a bug in dbxout.  A field is marked as DECL_IGNORED_P, but
dbxout_type_fields() still tries to access it.

Andrew.


2005-04-19  Andrew Haley  <[EMAIL PROTECTED]>

* dbxout.c (dbxout_type_fields): Check DECL_IGNORED_P before
looking at a field's bitpos.


Index: dbxout.c
===
RCS file: /cvs/gcc/gcc/gcc/dbxout.c,v
retrieving revision 1.227
diff -c -6 -p -r1.227 dbxout.c
*** dbxout.c12 Apr 2005 20:39:04 -  1.227
--- dbxout.c19 Apr 2005 13:17:51 -
*** dbxout_type_fields (tree type)
*** 1401,1420 
 return early.  */
if (tem == error_mark_node || TREE_TYPE (tem) == error_mark_node)
return;
  
/* Omit here local type decls until we know how to support them.  */
if (TREE_CODE (tem) == TYPE_DECL
  /* Omit fields whose position or size are variable or too large to
 represent.  */
  || (TREE_CODE (tem) == FIELD_DECL
  && (! host_integerp (bit_position (tem), 0)
  || ! DECL_SIZE (tem)
! || ! host_integerp (DECL_SIZE (tem), 1)))
! /* Omit here the nameless fields that are used to skip bits.  */
!  || DECL_IGNORED_P (tem))
continue;
  
else if (TREE_CODE (tem) != CONST_DECL)
{
  /* Continue the line if necessary,
 but not before the first field.  */
--- 1401,1420 
 return early.  */
if (tem == error_mark_node || TREE_TYPE (tem) == error_mark_node)
return;
  
/* Omit here local type decls until we know how to support them.  */
if (TREE_CODE (tem) == TYPE_DECL
+ /* Omit here the nameless fields that are used to skip bits.  */
+ || DECL_IGNORED_P (tem)
  /* Omit fields whose position or size are variable or too large to
 represent.  */
  || (TREE_CODE (tem) == FIELD_DECL
  && (! host_integerp (bit_position (tem), 0)
  || ! DECL_SIZE (tem)
! || ! host_integerp (DECL_SIZE (tem), 1
continue;
  
else if (TREE_CODE (tem) != CONST_DECL)
{
  /* Continue the line if necessary,
 but not before the first field.  */


CPP inconsistency

2005-04-19 Thread Etienne Lorrain
  Hi,

  Just a minor thing, but I hit this problem from time to time. I know
 the CPP preprocessor has no warning like "end of line ignored"...
 Seen with GCC-3.3.5 and 3.4.3.

[EMAIL PROTECTED]:~/projet$ cat > tmp.c
#define OPTION1 0x0001
#define OPTION2 0x0002
#define OPTION3 0x0004
#define OPTION4 0x0008

#define CONFIGURATION (OPTION1 | OPTION3)

#if CONFIGURATION & OPTION1
#warning OPTION1 set
#endif

#if CONFIGURATION & OPTION2
#warning OPTION2 set
#endif

#if CONFIGURATION & OPTION3
#warning OPTION3 set
#endif

#if CONFIGURATION & OPTION4
#warning OPTION4 set
#endif

#if !(CONFIGURATION & OPTION2)
#warning OPTION2 unset
#endif

#if (CONFIGURATION & OPTION2) == 0
#warning OPTION2 unset
#endif

// There is the problem: the "== 0" is ignored
#if CONFIGURATION & OPTION2 == 0
#warning OPTION2 unset
#else
#warning OPTION2 set
#endif

[EMAIL PROTECTED]:~/projet$ gcc -E tmp.c
# 1 "tmp.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "tmp.c"
tmp.c:9:2: warning: #warning OPTION1 set
tmp.c:17:2: warning: #warning OPTION3 set
tmp.c:25:2: warning: #warning OPTION2 unset
tmp.c:29:2: warning: #warning OPTION2 unset
tmp.c:36:2: warning: #warning OPTION2 set

  Etienne.








Re: CPP inconsistency

2005-04-19 Thread Daniel Jacobowitz
On Tue, Apr 19, 2005 at 03:28:07PM +0200, Etienne Lorrain wrote:
>   Hi,
> 
>   Just a minor thing, but I hit this problem times to times, I know
>  the CPP preprocessor has no warning like "end of line ignored"...
>  GCC-3.3.5 and 3.4.3.

This is just order of operations.

> [EMAIL PROTECTED]:~/projet$ cat > tmp.c
> #define OPTION1 0x0001
> #define OPTION2 0x0002
> #define OPTION3 0x0004
> #define OPTION4 0x0008
> 
> #define CONFIGURATION (OPTION1 | OPTION3)

> // There is the problem: the "== 0" is ignored
> #if CONFIGURATION & OPTION2 == 0
> #warning OPTION2 unset
> #else
> #warning OPTION2 set
> #endif

That's #if (1 | 4) & (2 == 0).  2 != 0, so 5&0 == 0.
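
The same parse can be checked with a small sketch that reuses the macros from the original report. The two *_TEST macros and the wrapper functions are hypothetical names added only for illustration.

```c
#define OPTION1 0x0001
#define OPTION2 0x0002
#define OPTION3 0x0004

#define CONFIGURATION (OPTION1 | OPTION3)

/* Parsed as CONFIGURATION & (OPTION2 == 0), i.e. 5 & 0, which is 0,
   so the #else branch is taken. */
#if CONFIGURATION & OPTION2 == 0
#define UNPARENTHESIZED_TEST 1
#else
#define UNPARENTHESIZED_TEST 0
#endif

/* Parsed as intended: (5 & 2) == 0 is true. */
#if (CONFIGURATION & OPTION2) == 0
#define PARENTHESIZED_TEST 1
#else
#define PARENTHESIZED_TEST 0
#endif

int unparenthesized_test(void) { return UNPARENTHESIZED_TEST; }
int parenthesized_test(void)   { return PARENTHESIZED_TEST; }
```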

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: libgcc_s.so 3.4 vs 3.0 compatibility

2005-04-19 Thread Jakub Jelinek
On Tue, Apr 19, 2005 at 02:23:26PM +0200, Peter FELECAN wrote:
> Currently the libgcc_s.so library has the same version in 3.4 and 4.0,
> i.e libgcc_s.so.1 (SONAME = libgcc_s.so.1).
> 
> Is this as expected?

Yes.

> Are the 2 libraries compatible? Interchangeable? Looking in the map
> file, I don't think so.

There is backwards compatibility.  So you want the latest libgcc_s.so.1
from the compilers used to build your applications.

Jakub


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Gareth Elston
Results for i686-pc-cygwin (c, c++, gfortran, objc) are here:
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01363.html
No regressions for c, c++, gfortran relative to RC1.
There are several new tests, which all pass, and one less failed test in 
libstdc++:

26_numerics/cmath/c99_classification_macros_c.cc (test for excess errors)
Gareth.


Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-19 Thread Devang Patel
On Apr 18, 2005, at 9:22 PM, Eric Christopher wrote:
Though of course, this doesn't mean that we can't have an option to
control it.  -Wno-cast-qual doesn't seem like the right choice, as
there is no user cast here.  Maybe something like -Wno-discard-qual,
where -Wdiscard-qual is the default.
I notice that these are pedwarns,
In that case, we can enable it only when -pedantic is used (like many
pedwarns) ?
You could, but in this case it's probably best to fix the code...
One can say this for most of the pedwarns :). I agree with you, but the
customer is complaining that he is getting 55000+ such warnings.

-
Devang


static inline functions disappear - incorrect static initialiser analysis?

2005-04-19 Thread Daniel Towner
Hi all,
I maintain a port of gcc for a 16-bit VLIW processor. For the last few 
months the port has been based on 3.4.1, but I've just decided to 
upgrade to gcc 4.0.0 branch. I've now got my port to compile with the 
latest code from the branch, and I'm doing regression tests. I've come 
across a problem which I can't figure out. Consider the following code 
fragment:

static inline void fn(void) { }

int main() {
  fn();
  return 0;
}

In the past, if I compiled without optimisation then `fn' would not be 
inlined, and main would contain a call to it. The assembly code for main 
would contain a reference to `fn', which would result in assemble_name 
being called, and would consequently result in `fn' being marked as 
referenced. This in turn would force gcc to emit a body for `fn'.

This no longer appears to be the case? `assemble_name' is still called 
with `fn', and correctly marks `fn' as being referenced, but no body is 
ever emitted. Do I now need to do something more than just calling 
assemble_name to get gcc to output the function body in such a case? I 
notice that the following comment appears in cgraphunit.c, in 
decide_is_function_needed:

  /* ??? If the assembler name is set by hand, it is possible to assemble
     the name later after finalizing the function and the fact is noticed
     in assemble_name then.  This is arguably a bug.  */

Is this the bug that I am experiencing, and does anyone know how I might 
work around it?

thanks,
dan.

Daniel Towner
picoChip Designs Ltd., Riverside Buildings, 108, Walcot Street, BATH,
BA1 5BG
[EMAIL PROTECTED]
07786 702589 




GCC superblock and region formation support

2005-04-19 Thread Robert Kidd
As a quick introduction, my name is Robert Kidd, and I'm working with 
the Gelato Federation to improve the performance of GCC on Itanium.  In 
particular, I'm looking into improving GCC's superblock support, 
hopefully bringing over some of what we have learned with the IMPACT 
compiler project.

After studying GCC's current tail duplication and extended basic block 
scheduling code, I'm thinking about ways to improve GCC's 
representation.  I'm aiming for a more concrete form than a basic block 
trace.  Steven Bosscher pointed me in the direction of the region 
formation project by Daniel Berlin and Kenneth Zadeck, which sounds 
like a good basis for a superblock representation.  What is the status 
of this project?  Has any documentation or code been released?

Thanks
Robert Kidd
[EMAIL PROTECTED]


Re: libgcc_s.so 3.4 vs 3.0 compatibility

2005-04-19 Thread Peter FELECAN
Jakub Jelinek <[EMAIL PROTECTED]> writes:

> On Tue, Apr 19, 2005 at 02:23:26PM +0200, Peter FELECAN wrote:
> > Currently the libgcc_s.so library has the same version in 3.4 and 4.0,
> > i.e libgcc_s.so.1 (SONAME = libgcc_s.so.1).
> > 
> > Is this as expected?
> 
> Yes.
> 
> > Are the 2 libraries compatible? Interchangeable? Looking in the map
> > file, I don't think so.
> 
> There is backwards compatibility.  So you want the latest libgcc_s.so.1
> from the compilers used to build your applications.

Thank you Jakub.

I should understand that, as usual, there is no "backward"
compatibility. My worry being the case when somebody, having installed
3.4 and 4.0, in this order, updates the previous compiler, 3.4, and
unknowingly replaces the latest library. For this, I think that
additional instrumentations are required by the corresponding
packages. Consequently, it is the role of the packager to assure an
hermetic separation.

-- 
Peter FELECAN
mailto:[EMAIL PROTECTED]


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Andreas Tobler
ppc-linux 32-bit.
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01370.html
Andreas


Re: GCC 4.0 RC1 Available

2005-04-19 Thread Mark Mitchell
Kaveh R. Ghazi wrote:
 >  > 2005-04-12 Paolo Bonzini <[EMAIL PROTECTED]>
 >  > 
 >  > * acx.m4 (ACX_PROG_GNAT): Remove stray break. 
 > 
 > OK for 4.0.0.

> Mark,
> When this patch went into 4.0, Paolo didn't regenerate the top level
> configure, although the ChangeLog claims he did:
> http://gcc.gnu.org/ml/gcc-cvs/2005-04/msg00842.html

Would you care to take care of that?  (I am travelling, and don't have
much time online.)  If so, I'd be very appreciative.

> The patch should also be applied to mainline, since the "break"
> problem exists there too.  I'm not sure why it wasn't, but perhaps
> your "OK for 4.0.0" didn't specify mainline and Paolo was being
> conservative.  I think we should fix it there also.

Yes, indeed.  The patch is certainly OK for mainline as well.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC1 Available

2005-04-19 Thread Paolo Bonzini

> Would you care to take care of that?  (I am travelling, and don't have
> much time online.)  If so, I'd be very appreciative.

Done.
I'll apply to mainline soon.

Paolo


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Richard Sandiford wrote:
> Results for mips-elf are here:
>    http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01331.html
> and look good.  No regressions.

Thanks; added to the Wiki.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
James E Wilson wrote:
> Mark Mitchell wrote:
>> The changes that I anticipate between now and the final release are
>> (a) documentation changes, (b) a patch for 20991, and (c) a possible
>> patch for 20973.  Other than that, I will only consider patches that
>> fix egregious problems, like a fail to bootstrap on a primary
>> platform.
>
> I put comments in PR 20973.  It seems reasonable for a reload patch,
> i.e. only moderately dangerously unsafe as opposed to severely
> dangerously unsafe.

Thank you for the comments.

> However, there is a bigger issue for Itanium.

I agree that this should be fixed, but I do think any fix will be too
intrusive for 4.0.0.  I'd suggest that we try to fix it for 4.0.1.

Thanks,
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: CPP inconsistency

2005-04-19 Thread Joe Buck

On Tue, Apr 19, 2005 at 03:28:07PM +0200, Etienne Lorrain wrote:

> > #define OPTION1 0x0001
> > #define OPTION2 0x0002
> > #define OPTION3 0x0004
> > #define OPTION4 0x0008
> > #define CONFIGURATION (OPTION1 | OPTION3)
> > // There is the problem: the "== 0" is ignored
> > #if CONFIGURATION & OPTION2 == 0

On Tue, Apr 19, 2005 at 09:33:44AM -0400, Daniel Jacobowitz wrote:
> That's #if (1 | 4) & (2 == 0).  2 != 0, so 5&0 == 0.

Yes, the fact that the bitwise binary operators have a lower
precedence than the relational operators is probably the biggest
botch in C and its derivative languages; I've been C-ing for 20
years, and I still make these kinds of errors from time to time.

GCC has the "suggest parentheses" warning elsewhere (to catch people
writing "if (foo = 0)" and the like); maybe there should be a warning
for this one as well.



Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Andrew Haley wrote:
Geoffrey Keating writes:
 > Mark Mitchell <[EMAIL PROTECTED]> writes:
 > 
 > > RC2 is available here:
 > > 
 > >   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
 > > 
 > > As before, I'd very much appreciate it if people would test these bits
 > > on primary and secondary platforms, post test results with the
 > > contrib/test_summary script, and send me a message saying whether or
 > > not there are any regressions, together with a pointer to the results.
 > 
 > Bad news, I'm afraid.

> It's a bug in dbxout.  A field is marked as DECL_IGNORED_P, but
> dbxout_type_fields() still tries to access it.
>
> Andrew.
>
> 2005-04-19  Andrew Haley  <[EMAIL PROTECTED]>
>
> * dbxout.c (dbxout_type_fields): Check DECL_IGNORED_P before
> looking at a field's bitpos.

The C++ front-end (and probably the C front-end) strips zero-width (and 
possibly unnamed) bitfields after class layout.  This can be justified 
in that those bitfields only affect layout; one doesn't need the 
middle-end to copy them around, etc.  So, you could probably fix this in 
the Java front end in the same way.  From your patch, it looks like 
you're letting the back end see these bitfields, and also that their 
DECL_SIZE is not set correctly, which is dangerous in general.

So, I would suggest fixing this in the Java front end.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Geoffrey Keating wrote:
> Mark Mitchell <[EMAIL PROTECTED]> writes:
>
>> RC2 is available here:
>>
>>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
>>
>> As before, I'd very much appreciate it if people would test these bits
>> on primary and secondary platforms, post test results with the
>> contrib/test_summary script, and send me a message saying whether or
>> not there are any regressions, together with a pointer to the results.
>
> Bad news, I'm afraid.
> On powerpc-darwin8, this fails to bootstrap, with an ICE in libjava (when
> trying to build gnu-xml.o).

I've put that information on the Wiki.  However, as powerpc-darwin is
not a primary platform, and as Java is not part of the release criteria,
I'm not going to block the release for this problem.  Hopefully it will
be fixed for 4.0.1.

> I'll run another build with a patch applied to disable libgcj on
> ppc-darwin, and see how that goes.  I'll also try to work out which
> patch broke it.

Thanks.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: libgcc_s.so 3.4 vs 3.0 compatibility

2005-04-19 Thread Joe Buck

On Tue, Apr 19, 2005 at 02:23:26PM +0200, Peter FELECAN wrote:
> > > Currently the libgcc_s.so library has the same version in 3.4 and 4.0,
> > > i.e libgcc_s.so.1 (SONAME = libgcc_s.so.1).
> > > 
> > > Is this as expected?

Jakub Jelinek <[EMAIL PROTECTED]> writes:
> > Yes.

Peter:
> > > Are the 2 libraries compatible? Interchangeable? Looking in the map
> > > file, I don't think so.

Jakub:
> > There is backwards compatibility.  So you want the latest libgcc_s.so.1
> > from the compilers used to build your applications.

Peter:
> I should understand that, as usual, there is no "backward"
> compatibility. My worry being the case when somebody, having installed
> 3.4 and 4.0, in this order, updates the previous compiler, 3.4, and
> unknowingly replaces the latest library. For this, I think that
> additional instrumentations are required by the corresponding
> packages. Consequently, it is the role of the packager to assure an
> hermetic separation.

No, I think you misunderstood Jakub.  If 4.0 is installed second, it will
update the libgcc_s.so.1, and all the apps compiled with either compiler
will continue to work.  You want to avoid the reverse situation, where 4.0
is installed first and then 3.4.


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Eric Botcazou wrote:
> SPARC/Solaris is OK:

Thanks; I've added your information to the Wiki.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Andrew Haley
Mark Mitchell writes:
 > Andrew Haley wrote:
 > > Geoffrey Keating writes:
 > >  > Mark Mitchell <[EMAIL PROTECTED]> writes:
 > >  > 
 > >  > > RC2 is available here:
 > >  > > 
 > >  > >   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
 > >  > > 
 > >  > > As before, I'd very much appreciate it if people would test these bits
 > >  > > on primary and secondary platforms, post test results with the
 > >  > > contrib/test_summary script, and send me a message saying whether or
 > >  > > not there are any regressions, together with a pointer to the results.
 > >  > 
 > >  > Bad news, I'm afraid.
 > > 
 > > It's a bug in dbxout.  A field is marked as DECL_IGNORED_P, but
 > > dbxout_type_fields() still tries to access it.
 > > 
 > > 2005-04-19  Andrew Haley  <[EMAIL PROTECTED]>
 > > 
 > >* dbxout.c (dbxout_type_fields): Check DECL_IGNORED_P before
 > >looking at a field's bitpos.
 > 
 > The C++ front-end (and probably the C front-end) strips zero-width (and 
 > possibly unnamed) bitfields after class layout.  This can be justified 
 > in that those bitfields only affect layout; one doesn't need the 
 > middle-end to copy them around, etc.  So, you could probably fix this in 
 > the Java front end in the same way.

Do you mean running through the struct removing such fields from the
list?  OK, I can do that.

 > From your patch, it looks like you're letting the back end see
 > these bitfields, and also that their DECL_SIZE is not set
 > correctly, which is dangerous in general.

I see.  Well, I have just made the change to dbxout.c anyway, and it
is correct, but I'll make the FE change anyway.

 > So, I would suggest fixing this in the Java front end.

I'll see if I can find the C++ front end code you refer to and use it
as a reference.

Andrew.


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Joe Buck wrote:
> http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01307.html

Thanks.

> For sparc-sun-solaris2.8, I get a failure when building the Java compiler,
> but I may be doing something wrong, as I usually avoid the Java build
> on Solaris (since it takes most of a day to build and test).

Thanks.  Based on the follow-up, and the fact that Java is not
release-critical, this is not a showstopper.  FWIW, we've seen similar
problems on HP-UX; there's confusion about which paths will be searched
by the build compiler vis a vis the compiler we're building.

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC superblock and region formation support

2005-04-19 Thread Daniel Berlin
On Tue, 2005-04-19 at 10:17 -0500, Robert Kidd wrote:
> As a quick introduction, my name is Robert Kidd, and I'm working with 
> the Gelato Federation to improve the performance of GCC on Itanium.  In 
> particular, I'm looking into improving GCC's superblock support, 
> hopefully bringing over some of what we have learned with the IMPACT 
> compiler project.
> 
> After studying GCC's current tail duplication and extended basic block 
> scheduling code, I'm thinking about ways to improve GCC's 
> representation.  I'm aiming for a more concrete form than a basic block 
> trace.  Steven Bosscher pointed me in the direction of the region 
> formation project by Daniel Berlin and Kenneth Zadeck, which sounds 
> like a good basis for a superblock representation.  What is the status 
> of this project?  Has any documentation or code been released?

The algorithm used actually comes from a never-to-be-published book by
Kenneth Zadeck, Fran Allen, and Barry Rosen, and, AFAWK, was never
published.

As such, the algorithm used for region formation is *very heavily*
described in the code (IE the book chapter was more or less copied into
the comments, minus some texisms).


The code is functional but being actively developed, but wasn't large
enough to bother creating a new branch for it.

If you want the current code, I'm sure Kenny would be happy to send it
to you, as he sent it to Steven Bosscher.

My involvement in this is mainly explaining to Kenny how to get done
what he wants to get done in GCC :)


> 
> Thanks
> Robert Kidd
> [EMAIL PROTECTED]
> 



Re: libgcc_s.so 3.4 vs 3.0 compatibility

2005-04-19 Thread Peter FELECAN
Joe Buck <[EMAIL PROTECTED]> writes:

> On Tue, Apr 19, 2005 at 02:23:26PM +0200, Peter FELECAN wrote:
> > > > Currently the libgcc_s.so library has the same version in 3.4 and 4.0,
> > > > i.e libgcc_s.so.1 (SONAME = libgcc_s.so.1).
> > > > 
> > > > Is this as expected?
> 
> Jakub Jelinek <[EMAIL PROTECTED]> writes:
> > > Yes.
> 
> Peter:
> > > > Are the 2 libraries compatible? Interchangeable? Looking in the map
> > > > file, I don't think so.
> 
> Jakub:
> > > There is backwards compatibility.  So you want the latest libgcc_s.so.1
> > > from the compilers used to build your applications.
> 
> Peter:
> > I should understand that, as usual, there is no "backward"
> > compatibility. My worry being the case when somebody, having installed
> > 3.4 and 4.0, in this order, updates the previous compiler, 3.4, and
> > unknowingly replaces the latest library. For this, I think that
> > additional instrumentations are required by the corresponding
> > packages. Consequently, it is the role of the packager to assure an
> > hermetic separation.
> 
> No, I think you misunderstood Jakub.  If 4.0 is installed second, it will
> update the libgcc_s.so.1, and all the apps compiled with either compiler
> will continue to work.  You want to avoid the reverse situation, where 4.0
> is installed first and then 3.4.

Joe,

I understood Jakub. My fingers writing "backward" instead of
"forward" are wrong :-) Anyway, thank you for the comment which
confirms my understanding.

-- 
Peter FELECAN
mailto:[EMAIL PROTECTED]


Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-19 Thread Jeffrey A Law
On Tue, 2005-04-19 at 08:49 +0200, Eric Botcazou wrote:
> > So the combination of the TCB merge plus the pending jump threading
> > changes apparently has ticked a reload bug which manifests itself with
> > the stage1 compiler mis-compiling the stage2 compiler.
> >
> > [...]
> >
> > Which faults because the memory location is actually  read-only memory.
> 
> PR rtl-optimization/15248.
Ah.  Good.  I was having a bloody hard time believing we hadn't run into
this before.

> 
> > What's not clear to me is how best to fix this.
> >
> > We could try to delete all assignments to pseudos which are equivalent
> > to MEMs.
> >
> > We could avoid recording equivalences when the pseudo is set more than
> > once.
> >
> > Other possibilities?
> 
> For 3.3 and 3.4, this was "fixed" by not recording memory equivalences that 
> have the infamous RTX_UNCHANGING_P flag set.
Also a possibility.  Making the equivalent change (!MEM_READONLY_P)
appears to do the trick for mainline.

Jeff



Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joe Buck
On Tue, Apr 19, 2005 at 08:12:05AM +0200, Eric Botcazou wrote:
> > For sparc-sun-solaris2.8, I get a failure when building the Java compiler,
> > but I may be doing something wrong, as I usually avoid the Java build
> > on Solaris (since it takes most of a day to build and test).
> 
> Known glitch.  You have to find out why configure thinks you have libiconv 
> installed and yet the library is not found.

I successfully got it to work by configuring with an explicit
--with-libiconv-prefix flag pointing to a library with iconv in it.
If configure is allowed to find iconv on its own, it messes up.

It's now in the middle of building the Java library.

> > I do have a build report that was generated over the weekend for
> > sparc-sun-solaris2.8 that does not contain Java, it is at
> >
> > http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01245.html
> >
> > with tests for both 32 and 64 bits.  It shows additional failures in
> > 64-bit mode that do not appear in 32-bit mode.
> 
> Binutils 2.15 bug in 64-bit mode.  Didn't you get my message via Binutils' 
> Bugzilla?  The patch is at:
> http://sourceware.org/ml/binutils-cvs/2005-01/msg00019.html

Yes, you sent me a message before when I couldn't build at all, which I
applied, but you pointed me to a different patch:

http://sources.redhat.com/ml/binutils-cvs/2004-09/msg00036.html

This is the patch that http://gcc.gnu.org/install/specific.html#x-x-solaris2
points to.  If an additional patch is needed, install/specific.html
should be updated, and perhaps a single patch that does the whole job
should be made available.



Re: GCC 4.0 RC2 Available

2005-04-19 Thread Richard Kenner
    The C++ front-end (and probably the C front-end) strips zero-width
    (and possibly unnamed) bitfields after class layout.  This can be
    justified in that those bitfields only affect layout; one doesn't need
    the middle-end to copy them around, etc.  So, you could probably fix
    this in the Java front end in the same way.  From your patch, it looks
    like you're letting the back end see these bitfields, and also that
    their DECL_SIZE is not set correctly, which is dangerous in general.

    So, I would suggest fixing this in the Java front end.

Note that the Ada front-end also has zero-width bitfields (they are of
aggregate type, though), and they are not and cannot be removed (since they
can be referenced).  DECL_SIZE is set correctly, though, and this bug
is not encountered (at least I've never seen it).


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joseph S. Myers
On Mon, 18 Apr 2005, Joe Buck wrote:

> It appears the bug is because there's a libiconv.so in /usr/local/lib on
> that machine, with headers in /usr/local/include, but /usr/local/lib isn't
> in my LD_LIBRARY_PATH.  configure finds the declaration and assumes it
> can call the function.  Sorry, I do most of my work in GNU/Linux these
> days so my Solaris setup has rotted. I'll try that one again with a
> proper LD_LIBRARY_PATH.

The iconv bugs are 7881, 10657, 12596, 18303.  I think the documentation 
will need to have something added about LD_LIBRARY_PATH for libiconv, 
while 18303 is the one to fix first in order to get the configuration 
right; it may be necessary to determine whether the header is system 
iconv.h or from GNU libiconv, and likewise for the library (-liconv ought 
to be GNU libiconv while no special library means using libc) and take 
care to avoid getting GNU iconv.h but libc iconv or GNU libiconv but libc 
iconv.h (the include and linker search paths used in configuration may not 
be consistent if --with-libiconv-prefix hasn't been used).

-- 
Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Andrew Haley wrote:
> Do you mean running through the struct removing such fields from the
> list?  OK, I can do that.

Yes.

>  > So, I would suggest fixing this in the Java front end.
>
> I'll see if I can find the C++ front end code you refer to and use it
> as a reference.

Look in class.c for remove_zero_width_bitfields.  (Spelling might be a
little off.)
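
As a rough illustration of the approach discussed here (walking a class's
field list after layout and unlinking the zero-width bit-fields), a minimal
C sketch follows.  The struct field type and the function are hypothetical
stand-ins; this is not GCC's tree API and not the actual code in class.c.

```c
#include <stddef.h>

/* Hypothetical stand-in for a chain of field declarations.
   This is NOT GCC's tree representation.  */
struct field
{
  const char *name;     /* NULL for an unnamed field */
  int is_bitfield;      /* nonzero if declared as a bit-field */
  unsigned width;       /* declared width in bits */
  struct field *next;   /* next field in the chain */
};

/* Unlink every zero-width bit-field from the chain headed by *HEAD.
   Layout is assumed to be finished already, so the fields are no
   longer needed.  Returns the number of fields removed.  */
size_t
remove_zero_width_bitfields (struct field **head)
{
  size_t removed = 0;
  struct field **fieldsp = head;

  while (*fieldsp != NULL)
    {
      struct field *f = *fieldsp;

      if (f->is_bitfield && f->width == 0)
        {
          /* Splice F out; FIELDSP now points at F's successor.  */
          *fieldsp = f->next;
          removed++;
        }
      else
        fieldsp = &f->next;
    }

  return removed;
}
```

Walking with a pointer to the link rather than a pointer to the node means
the first field needs no special case.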

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


register name for DW_AT_frame_base value

2005-04-19 Thread Jerome Guitton

A Dwarf interpretation question:

We have a problem to make GCC-compiled code interact with the HP
native debugger, and it looks like it is caused by the way the
attribute DW_AT_frame_base is interpreted. Apparently, when a frame
pointer can be found in a register, the value generated by GCC for the
DW_AT_frame_base of the corresponding subroutine is a register name
(i.e. DW_OP_reg). Is it really allowed by the dwarf 2
standard? I would tend to think that it is not.

The "definition" of DW_AT_frame_base is:

"A subroutine or entry point entry may also have a DW_AT_frame_base
attribute, whose value is a location description that computes the
frame base for the subroutine or entry point."

OK. So, it is a location description. A location description can be
either a register name or an addressing operation, so at first glance
it is correct. But if the value is a "location description that
computes the frame base", when evaluated it should be an address;
so it cannot be a register name, IMHO.

I think it is even clearer when you read the definition of DW_OP_fbreg:

"The DW_OP_fbreg operation provides a signed LEB128 offset from the
address specified by the location description in the DW_AT_frame_base   
attribute of the current function. (This is typically a stack pointer
register plus or minus some offset. On more sophisticated systems it
might be a location list that adjusts the offset according to changes  
in the stack pointer as the PC changes.)"

It really looks like the behavior of DW_OP_fbreg is not specified when
the value of DW_AT_frame_base is a register name. And the example of
DW_OP_fbreg makes it quite clear, I think:

DW_OP_fbreg -50
Given an DW_AT_frame_base value of OPBREG31 64, this example computes
the address of a local variable that is -50 bytes from a logical frame
pointer that is computed by adding 64 to the current stack pointer
(register 31).

So, when a frame pointer is stored into a register , I would
say that the value of DW_AT_frame_base should be something like
"DW_OP_bregx  0" instead of something like "DW_OP_regx ".
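
To make the arithmetic concrete, here is a small C sketch of how a consumer
might evaluate the two operations involved.  The struct machine type and the
function names are hypothetical; the sketch only models "register contents
plus offset" and "frame base plus offset", and takes no position on what a
bare register name in DW_AT_frame_base should mean.

```c
#include <stdint.h>

#define NUM_REGS 32

/* Hypothetical machine state: just a register file.  */
struct machine
{
  uint64_t reg[NUM_REGS];
};

/* DW_OP_breg<n> <offset>: the contents of register N plus a signed
   offset.  The result is an address.  */
uint64_t
eval_breg (const struct machine *m, unsigned n, int64_t offset)
{
  return m->reg[n] + (uint64_t) offset;
}

/* DW_OP_fbreg <offset>: the frame base (an address obtained by
   evaluating DW_AT_frame_base) plus a signed offset.  */
uint64_t
eval_fbreg (uint64_t frame_base, int64_t offset)
{
  return frame_base + (uint64_t) offset;
}
```

With register 31 holding 0x1000, a frame base of "OPBREG31 64" evaluates to
0x1040, and "DW_OP_fbreg -50" then yields 0x100e, i.e. 14 bytes above the
stack pointer, matching the example quoted from the standard.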

Opinions/thoughts? If we agree on that, I should be able to submit a
patch pretty soon; it should be quite easy to fix (provided that there
is something to fix).

-- 
Jerome


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joe Buck
On Tue, Apr 19, 2005 at 04:20:19PM +, Joseph S. Myers wrote:
> On Mon, 18 Apr 2005, Joe Buck wrote:
> 
> > It appears the bug is because there's a libiconv.so in /usr/local/lib on
> > that machine, with headers in /usr/local/include, but /usr/local/lib isn't
> > in my LD_LIBRARY_PATH.  configure finds the declaration and assumes it
> > can call the function.  Sorry, I do most of my work in GNU/Linux these
> > days so my Solaris setup has rotted. I'll try that one again with a
> > proper LD_LIBRARY_PATH.
> 
> The iconv bugs are 7881, 10657, 12596, 18303.  I think the documentation 
> will need to have something added about LD_LIBRARY_PATH for libiconv, 
> while 18303 is the one to fix first in order to get the configuration 
> right; it may be necessary to determine whether the header is system 
> iconv.h or from GNU libiconv, and likewise for the library (-liconv ought 
> to be GNU libiconv while no special library means using libc) and take 
> care to avoid getting GNU iconv.h but libc iconv or GNU libiconv but libc 
> iconv.h (the include and linker search paths used in configuration may not 
> be consistent if --with-libiconv-prefix hasn't been used).

In my case, the /usr/local copy was GNU libiconv.  --with-libiconv-prefix
was needed to make things work.  I think that autoconf, finding an iconv
in one of the "standard locations" (/usr/include, /usr/local/include)
assumed that it had determined that there was an iconv in the standard
library.
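
For illustration, the kind of check that would catch this mismatch is one
that has to link and run, not merely find a declaration.  A minimal sketch
in C follows; the function name iconv_usable is hypothetical, and this is
not the test that GCC's configure actually runs.

```c
#include <iconv.h>

/* Returns 1 if a conversion descriptor can be opened and closed,
   0 otherwise.  Merely compiling this only proves that some iconv.h
   was found; the call must also link against a library that matches
   that header, which is where an inconsistent include path and
   linker search path would show up.  */
int
iconv_usable (void)
{
  iconv_t cd = iconv_open ("UTF-8", "ASCII");

  if (cd == (iconv_t) -1)
    return 0;

  iconv_close (cd);
  return 1;
}
```

If the header comes from GNU libiconv while the linker only sees libc (or
the other way around), a program like this would be expected to fail at
link time rather than at compile time.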


Re: register name for DW_AT_frame_base value

2005-04-19 Thread Daniel Jacobowitz
On Tue, Apr 19, 2005 at 06:29:27PM +0200, Jerome Guitton wrote:
> 
> A Dwarf interpretation question:
> 
> We have a problem to make GCC-compiled code interact with the HP
> native debugger, and it looks like it is caused by the way the
> attribute DW_AT_frame_base is interpreted. Apparently, when a frame
> pointer can be found in a register, the value generated by GCC for the
> DW_AT_frame_base of the corresponding subroutine is a register name
> (i.e. DW_OP_reg). Is it really allowed by the dwarf 2
> standard? I would tend to think that it is not.

You may want to join the dwarf-discuss list, where this exact same
conversation is taking place - probably about the exact same
interaction.

There have been voices on both sides, but I believe there's a narrow
majority towards allowing the current behavior.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: register name for DW_AT_frame_base value

2005-04-19 Thread Jerome Guitton
Daniel Jacobowitz ([EMAIL PROTECTED]):

> You may want to join the dwarf-discuss list, where this exact same
> conversation is taking place - probably about the exact same
> interaction.

OK, thanks!

-- 
Jerome


Re: register name for DW_AT_frame_base value

2005-04-19 Thread Daniel Berlin
On Tue, 2005-04-19 at 18:29 +0200, Jerome Guitton wrote:
> A Dwarf interpretation question:
> 
> We have a problem to make GCC-compiled code interact with the HP
> native debugger, and it looks like it is caused by the way the
> attribute DW_AT_frame_base is interpreted. Apparently, when a frame
> pointer can be found in a register, the value generated by GCC for the
> DW_AT_frame_base of the corresponding subroutine is a register name
> (i.e. DW_OP_reg). Is it really allowed by the dwarf 2
> standard? I would tend to think that it is not.
> 
> The "definition" of DW_AT_frame_base is:
> 
> "A subroutine or entry point entry may also have a DW_AT_frame_base
> attribute, whose value is a location description that computes the
> frame base for the subroutine or entry point."
> 
> OK. So, it is a location description. A location description can be
> either a register name or an addressing operation, so at first glance
> it is correct. But if the value is a "location description that
> computes the frame base", when evaluated it should be an address;
> so it cannot be a register name, IMHO.

> So, when a frame pointer is stored into a register , I would
> say that the value of DW_AT_frame_base should be something like
> "DW_OP_bregx  0" instead of something like "DW_OP_regx ".

We are having this exact discussion on the DWARF3 standard mailing list
right now (whether fbreg has an implicit dereference, etc).

You should probably take this there.




Re: GCC 4.0 RC2 Available

2005-04-19 Thread Andrew Haley
Andrew Haley writes:
 > Mark Mitchell writes:
 >  > 
 >  > The C++ front-end (and probably the C front-end) strips
 >  > zero-width (and possibly unnamed) bitfields after class layout.
 >  > This can be justified in that those bitfields only affect
 >  > layout; one doesn't need the middle-end to copy them around,
 >  > etc.  So, you could probably fix this in the Java front end in
 >  > the same way.
 > 
 > Do you mean running through the struct removing such fields from the
 > list?  OK, I can do that.

Ah, hold on, this doesn't seem right.

At compile time we don't know the field offset of fields that we
inherit, because it can change at runtime.  So, we don't set the
FIELD_OFFSET, and that is why dbxout is aborting.

However, these fields are real, and they are used, but we shouldn't
output any debug info for them.  If I were to remove them from the
list of fields they'd have to be recreated because they may be needed
while compiling classes later in the same compilation unit.

I set DECL_IGNORED_P on these fields because I don't want debuginfo to
pay any attention to them.  I could, I suppose, set their offset to
zero or even error_mark_node, which seems to work.

All I want is for FIELD_OFFSET to be "don't know".

Andrew.


Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-19 Thread Paul Schlie
> Eric Botcazou writes:
>> Jeffrey A Law writes:
>> ...
>> Which faults because the memory location is actually  read-only memory.
>
> PR rtl-optimization/15248.
>
>> What's not clear to me is how best to fix this.
>>
>> We could try to delete all assignments to pseudos which are equivalent
>> to MEMs.
>>
>> We could avoid recording equivalences when the pseudo is set more than
>> once.
>>
>> Other possibilities?
>
> For 3.3 and 3.4, this was "fixed" by not recording memory equivalences that
> have the infamous RTX_UNCHANGING_P flag set.

My understanding is that UNCHANGING is/should-be uniquely associated
with "literal static const data", which may have been declared either via
an explicit "static const" variable declaration, or indirectly as a literal
const value which may be used to initialize non-"static const" variable
values; and whose reference may not survive if all uses of its declared
value are in-lined as immediate data.

As such all other uses of "UNCHANGING" potentially denoting const variables
are incorrect, and should be fixed; as "const" need not exist in tree data
to denote constant variables, as the trees should already be "correct by
construction", as the language's "front-end" should have prevented any
assignments to declared "const" variables other than initialization from
being constructed. Thereby as none of the optimizations should modify the
semantics of the tree, a const variable will never be assigned a logical
value other than its designated initializer value, which may either be
spilled and reloaded as any other variable's value may be, or regenerated
by reallocating and reinitializing it if preferred. (Correspondingly all
tree/rtx optimizations which convert a (mem (symb)) => (mem (ptr))
reference must copy/preserve the original mem reference attributes to the
new one, as otherwise they will be improperly lost.)

Thereby MEM_READONLY_P uniquely applies to "literal static const data"
references to enable its allocation and accesses to be reliably
identifiable as may be required to enable target specific "literal static
const data" allocation and code generation, as may be required if stored
and accessed from a target specific ROM memory region.

i.e.:

static const char s[] = "abc"; // s[4] = {'a','b','c',0} array
                               // of "literal static const data"
                               // MEM_READONLY_P (mem(symb(s))) == true;

static       char s[] = "abc"; // C.x[4] = {'a','b','c',0} array
                               // of "literal static const data"
                               // MEM_READONLY_P (mem(symb(C.x))) == true;
                               // s[4] array of char, init with C.x[]
                               // MEM_READONLY_P (mem(symb(s))) == false;

       const char s[] = "abc"; // C.x[4] = {'a','b','c',0} array
                               // of "literal static const data"
                               // MEM_READONLY_P (mem(symb(C.x))) == true;
                               // s[4] array of char, init with C.x[]
                               // MEM_READONLY_P (mem(symb(s))) == false;

some-const-char*-funct("abc"); // C.x[4] = {'a','b','c',0} array
                               // of "literal static const data"
                               // some-const-char*-funct(C.x);

(Does that seem correct?)





Re: CPP inconsistency

2005-04-19 Thread Zack Weinberg
Joe Buck <[EMAIL PROTECTED]> writes:

> GCC has the "suggest parentheses" warning elsewhere (to catch people
> writing "if (foo = 0)" and the like); maybe there should be a warning
> for this one as well.

I'd be happy to take a patch to add -Wparentheses support to
libcpp/expr.c.

zw


Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-19 Thread Paul Schlie
> From: Paul Schlie <[EMAIL PROTECTED]>
> 
> some-const-char*-funct("abc"); // C.x[4] = {'a','b','c',0} array
>                                // of "literal static const data"
>                                // some-const-char*-funct(C.x);

Or rather I suspect it implies the allocation of a temporary to store
C.x[] into then passing the reference to the temporary (as there seems to
be no present way to define a function parameter which points to a "literal
static const data" object, vs. a generic allocated const object, because
although "static const char s[]" may denote an array of "literal static
const data", funct(static const char *) is interpreted as attempting to
declare the storage class of the pointer parameter, as opposed to qualifying
the storage class of the object it's pointing to)?




Re: front-end tools for preprocessor / macro expansion

2005-04-19 Thread Tom Tromey
> "Henrik" == Henrik Sorensen <[EMAIL PROTECTED]> writes:

Henrik> For the PL/I front-end project (pl1gcc.sourceforge.net), I am
Henrik> just about to begin to add a preprocessor expansion step, and
Henrik> was wondering what other front-end do.

Henrik> My initial thoughts were to create a completely separate
Henrik> program that just do the preprocessing and passes the output
Henrik> to the compiler.

The C preprocessor was initially a standalone executable and was
rewritten to be a library.  I would recommend you just start out this
way.  It is simple to turn a library into a standalone executable, if
that turns out to be desirable, but harder to go the other direction.
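
A minimal sketch of that structure in C, under stated assumptions: pp_expand
and PP_STANDALONE are hypothetical names, and the "expansion" is a plain copy
standing in for real macro processing.  The point is only that the
standalone tool is a thin loop around the library entry point.

```c
#include <stdio.h>
#include <string.h>

/* Library entry point (hypothetical).  Writes the processed form of
   SRC into DST, which has room for CAP bytes.  Real macro expansion
   would happen here; this placeholder copies the text unchanged.
   Returns 0 on success, -1 if DST is too small.  */
int
pp_expand (const char *src, char *dst, size_t cap)
{
  size_t len = strlen (src);

  if (len + 1 > cap)
    return -1;

  memcpy (dst, src, len + 1);
  return 0;
}

#ifdef PP_STANDALONE
/* Thin driver, compiled only when a standalone executable is wanted:
   read lines from stdin, hand each to the library, print the result.  */
int
main (void)
{
  char in[1024];
  char out[1024];

  while (fgets (in, sizeof in, stdin) != NULL)
    {
      if (pp_expand (in, out, sizeof out) != 0)
        return 1;
      fputs (out, stdout);
    }

  return 0;
}
#endif
```

A compiler front end would call pp_expand directly; building the same file
with -DPP_STANDALONE gives the separate program.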

Tom


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Tom Tromey
> "Andrew" == Andrew Haley <[EMAIL PROTECTED]> writes:

Andrew> At compile time we don't know the field offset of fields that we
Andrew> inherit, because it can change at runtime.  So, we don't set the
Andrew> FIELD_OFFSET, and that is why dbxout is aborting.

Andrew> All I want is for FIELD_OFFSET to be "don't know".

In gcjx I'm just generating casts and raw pointer math for this case,
like:

  // Generate *(TYPE *) ((char *) OBJ + OFFSET)

Is it better to generate a COMPONENT_REF instead?
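
For readers unfamiliar with the idiom, here is a C rendering of the
expression above.  FIELD_AT is a hypothetical macro name; the sketch assumes
the offset is only known at run time and that the object really does contain
a suitably aligned TYPE at that offset.

```c
#include <stddef.h>

/* *(TYPE *) ((char *) OBJ + OFFSET): access a field through its byte
   offset instead of through a member name.  OFFSET may be a value
   that is only known at run time.  */
#define FIELD_AT(type, obj, offset) \
  (*(type *) ((char *) (obj) + (offset)))

/* Example layout used only to obtain a realistic offset.  */
struct sample
{
  int a;
  double b;
  int c;
};

/* Read the int stored OFFSET bytes into OBJ.  */
int
read_int_field (void *obj, size_t offset)
{
  return FIELD_AT (int, obj, offset);
}
```

For instance, read_int_field (&s, offsetof (struct sample, c)) returns s.c,
and FIELD_AT can also be used on the left-hand side of an assignment.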

Tom


Re: Heads-up: volatile and C++

2005-04-19 Thread Ken Raeburn
On Apr 18, 2005, at 18:17, Robert Dewar wrote:
>> Is there anything in the language specifications (mainly C++ in this
>> context, but is this an area where C and C++ are going to diverge, or
>> is C likely to follow suit?) that prohibits spurious writes to a
>> location?
>
> Surely the deal is that spurious writes are allowed unless the
> location is volatile. What other interpretation is possible?

That's what I thought.  So, unless the compiler (or language spec) is
going to become thread-aware, any data to be shared across threads
needs to be declared volatile, even if some other mechanism (like a
mutex) is in use to do some synchronization.  Which means performance
would be poor for any such data.

Which takes me back to: I think the compiler needs to be thread-aware.  
"Enhancing" the meaning of volatile, with the attendant performance 
issues, still doesn't seem adequate to allow for multithreaded 
programming, unless it's used *everywhere*, and performance shoots 
through the floor.
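
As an illustration of what a "spurious write" could look like, consider the
pair of C functions below.  The second is a hypothetical rewrite of the
first, of the kind an optimizer that knows nothing about threads might be
entitled to make; it is not taken from any particular compiler.

```c
/* As written: *P is stored to only when FLAG is nonzero.  */
void
set_if (int *p, int flag)
{
  if (flag)
    *p = 1;
}

/* Hypothetical transformed form: *P is always stored to, writing
   back the old value when FLAG is zero.  In a single thread the two
   functions are indistinguishable.  */
void
set_if_spurious (int *p, int flag)
{
  int old = *p;

  *p = flag ? 1 : old;
}
```

With a second thread updating *p between the load and the store, the
rewritten form can undo that thread's update even when flag is zero, which
is the hazard under discussion.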


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Andrew Haley
Tom Tromey writes:
 > > "Andrew" == Andrew Haley <[EMAIL PROTECTED]> writes:
 > 
 > Andrew> At compile time we don't know the field offset of fields that we
 > Andrew> inherit, because it can change at runtime.  So, we don't set the
 > Andrew> FIELD_OFFSET, and that is why dbxout is aborting.
 > 
 > Andrew> All I want is for FIELD_OFFSET to be "don't know".
 > 
 > In gcjx I'm just generating casts and raw pointer math for this case,
 > like:
 > 
 >   // Generate *(TYPE *) ((char *) OBJ + OFFSET)

I do that too.  Generating code isn't the problem here: it's that
dbxout is generating debug info for types.  I'd rather it didn't do
that, because the info is wrong, but it's not really a problem.

 > Is it better to generate a COMPONENT_REF instead?

No.

This isn't in the code, it's in the type info of the superclass.

Andrew.


Re: The subreg question

2005-04-19 Thread James E Wilson
Ling-hua Tseng wrote:
> James E Wilson wrote:
>
> I read the descriptions of (high:m exp) and (lo_sum:m x y) in the gcc
> internal manuals (Section 10.7 and 10.9).
> The last line of their descriptions confused me because they wrote "m
> should be Pmode".

A doc bug.  You only need Pmode if you are operating on an address.
high was originally added for loading addresses on a risc, like a sparc,
in which case you would have to have Pmode.  However, it has proven
useful in many other cases.  The sparc.md file for instance has an
SFmode pattern that uses high/lo_sum, for loading an SFmode constant
into a general register.
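
A small C sketch of the arithmetic behind high/lo_sum, for illustration only.
It assumes the SPARC split of a 32-bit word into the upper 22 bits and the
lower 10 bits, and a 32-bit float; the function names are hypothetical and
nothing here is taken from sparc.md.

```c
#include <stdint.h>
#include <string.h>

/* (high x): keep the upper 22 bits of X, clear the low 10.  */
uint32_t
high_part (uint32_t x)
{
  return x & ~UINT32_C (0x3ff);
}

/* (lo_sum h x): H plus the low 10 bits of X.  When H is
   high_part (X), the result is X again.  */
uint32_t
lo_sum (uint32_t h, uint32_t x)
{
  return h + (x & UINT32_C (0x3ff));
}

/* The bit pattern of a float, as it would sit in a general register.
   Assumes float is 32 bits wide.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;

  memcpy (&bits, &f, sizeof bits);
  return bits;
}
```

Loading a float constant into a general register then amounts to computing
lo_sum (high_part (b), b) with b = float_bits (value), which involves no
address and so has no need for Pmode.
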
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-19 Thread James E Wilson
Devang Patel wrote:
> On Apr 18, 2005, at 6:29 PM, James E Wilson wrote:
>> I notice that these are pedwarns,
>
> In that case, we can enable it only when -pedantic is used (like many
> pedwarns) ?

Consider this small modification to your testcase.

const char *a( void )
{
  return "abc";
}

int main( void )
{
  char *s = a();
  s[0] = 'c';
  return 0;
}

This will core dump when run on any system that write-protects read-only
data.  The only warning that you will get from gcc is the one that you
are asking us to disable.  This is the reason for the warning, as it is
necessary to detect unsafe code like this.
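
For contrast, one possible qualifier-clean version of the same code is
sketched below: the result of a() stays const, and the modification is done
on a private copy.  The function modified_copy is a hypothetical helper added
for the illustration.

```c
#include <stdlib.h>
#include <string.h>

const char *
a (void)
{
  return "abc";
}

/* Returns a heap-allocated, writable copy of a()'s string with its
   first character replaced, or NULL if allocation fails.  The caller
   frees the result.  */
char *
modified_copy (void)
{
  const char *s = a ();        /* qualifier kept: no warning */
  char *copy = malloc (strlen (s) + 1);

  if (copy == NULL)
    return NULL;

  strcpy (copy, s);
  copy[0] = 'c';               /* writes to the copy, not the literal */
  return copy;
}
```

Here the string literal is never written to, so the code behaves the same
whether or not read-only data is write-protected.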

I don't believe this warning should depend on -pedantic, as it is doing 
something useful.  I think a special -Wno- option makes more sense.

I was going to ask why you need this change, but you already answered 
that question.  If I had a customer getting 55k+ warnings, I'd want to 
add an option for them also.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joseph S. Myers
Results for hppa2.0w-hp-hpux11.11, no regressions:

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01379.html

-- 
Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joe Buck
On Mon, Apr 18, 2005 at 07:44:03AM -0700, Mark Mitchell wrote:
> 
> RC2 is available here:
> 
>   ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/

x86_64-unknown-linux-gnu results (for RHEL v3) are at

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01333.html

The failures are almost all related to nested functions; I've always
seen these same failures on that machine.  So it looks good.



Re: [RFC] warning: initialization discards qualifiers from pointer target type

2005-04-19 Thread Devang Patel
On Apr 19, 2005, at 11:51 AM, James E Wilson wrote:
Devang Patel wrote:
On Apr 18, 2005, at 6:29 PM, James E Wilson wrote:
I notice that these are pedwarns,
In that case, we can enable it only when -pedantic is used (like  
many  pedwarns) ?
Consider this small modification to your testcase.
const char *a( void )
{
  return "abc";
}
int main( void )
{
  char *s = a();
  s[0] = 'c';
  return 0;
}
This will core dump when run on any system that write-protects read- 
only data.  The only warning that you will get from gcc is the one  
that you are asking us to disable.  This is the reason for the  
warning, as it is necessary to detect unsafe code like this.
This makes sense. On the other side, we (Apple) still support - 
fwritable-stings ;)

I don't believe this warning should depend on -pedantic, as it is  
doing something useful.  I think a special -Wno- option makes more  
sense.
OK.
I was going to ask why you need this change, but you already  
answered that question.  If I had a customer getting 55k+ warnings,  
I'd want to add an option for them also.
I'll try to convince the customer one more time. If that does not work, then
I'll prepare a patch to control this warning using -Wno-discard-qual.

Thanks,
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
-
Devang


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Eric Botcazou
> Yes, you sent me a message before when I couldn't build at all, which I
> applied, but you pointed me to a different patch:

I was talking about a second message.

> If an additional patch is needed, install/specific.html should be updated,
> and perhaps a single patch that does the whole job should be made available.

The second patch is not necessary.  It is only meant to avoid the tons of 
failures you get (it was inadvertently dropped from Binutils 2.15).

-- 
Eric Botcazou


Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-19 Thread Eric Botcazou
> > For 3.3 and 3.4, this was "fixed" by not recording memory equivalences
> > that have the infamous RTX_UNCHANGING_P flag set.
>
> Also a possibility.  Making the equivalent change (!MEM_READONLY_P)
> appears to do the trick for mainline.

Yes, but that's of course not optimal, unnecessary spills are generated.

-- 
Eric Botcazou


Re: internal compiler error at dwarf2out.c:8362

2005-04-19 Thread Björn Haase
Am Dienstag, 19. April 2005 00:30 schrieb James E Wilson:
> Björn Haase wrote:
> > In case that one should not use machine specific attributes, *is* there
> > a standard way for GCC how to implement different address spaces?
>
> Use section attributes to force functions/variables into different
> sections, and then use linker scripts to place different sections into
> different address spaces.  You can define machine dependent attributes
> as short-hand for a section attribute, and presumably the eeprom
> attribute is an example of that.
>
> The only thing wrong with the eeprom attribute is that it is trying to
> create its own types.  It is not necessary to create new types in order
> to get variables placed into special sections.  There is nothing wrong
> with the concept of having an eeprom attribute.
Hi,

I am aware of the possibility of using section attributes to tell the linker 
where to place which kind of object. The difficulty is that 
for a couple of targets like AVR (and I think that this EEPROM issue for the 
HC05 is very much the same problem) you are required to use a 
completely different set of assembler instructions for accessing different 
regions of memory. So it is not sufficient to simply tell the linker 
where to place which object. 

To give you an example: if you are trying to access read-only 
program memory on avr, you can only access it by use of a single pointer 
register (Z reg.) and the only addressing modes that are available are 
register direct and register direct with pre decrement and post increment.

If you are accessing r/w data memory, you can make use of 3 
different pointer registers (X,Y,Z) and there are also instructions with 
immediate memory addresses and register direct addressing with offset as well 
as a couple of instructions that can directly reference memory (bit tests, 
e.g.).

The problem therefore is that the compiler itself would need to know which 
type of memory reference it is currently working on in order to know which 
kind of instruction set will be functional. So, I think what would be needed 
is something that is reflected in the type system. IIUC, this is the 
background of the change in the type system that the previous message in the 
thread is about. 
What would be best is some kind of "sticky flag" that is carried around with 
every tree node or RTL expression that stems from a memory reference that 
has once been marked by a particular attribute.
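
A toy model may make the "sticky flag" idea concrete. This is only a sketch under stated assumptions: the two memories are plain arrays, and the names mem_space, tagged_addr, load_byte and store_data_byte are hypothetical and do not exist in GCC or avr-libc. The point is that the load routine must see the tag to pick the right access path, which a linker-only section placement cannot provide.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MEM_SIZE 16u

/* Hypothetical tag that travels with every address, in the spirit of the
   "sticky flag" described above. */
enum mem_space { SPACE_DATA, SPACE_PROGRAM };

struct tagged_addr {
    enum mem_space space;
    uint8_t offset;
};

static uint8_t data_mem[TOY_MEM_SIZE];                              /* r/w data memory */
static const uint8_t program_mem[TOY_MEM_SIZE] = { 'G', 'C', 'C' }; /* read-only memory */

/* Writes only make sense for the data space in this toy model. */
void store_data_byte(uint8_t offset, uint8_t value)
{
    assert(offset < TOY_MEM_SIZE);
    data_mem[offset] = value;
}

/* The load routine has to branch on the tag, because each space would
   need a different instruction sequence on a real target. */
uint8_t load_byte(struct tagged_addr addr)
{
    assert(addr.offset < TOY_MEM_SIZE);
    switch (addr.space) {
    case SPACE_PROGRAM:
        return program_mem[addr.offset];  /* stands in for a program-memory load */
    case SPACE_DATA:
    default:
        return data_mem[addr.offset];     /* stands in for an ordinary data load */
    }
}
```

If the tag were dropped anywhere between the declaration and the final load, the routine could no longer tell which path to take, which is the loss of information the message is worried about.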

Yours,

Björn


RFC: ms bitfields of aligned basetypes

2005-04-19 Thread Joern RENNECKE
t001_x of the struct-layout test has such beauties as:
typedef _Bool Tal16bool __attribute__((aligned (16)));
struct S49 { Tal16bool a:1; } ;
The field a only gets BIGGEST_ALIGNMENT (i.e. 64 bits), rather than the 128 bits
required for Tal16bool.  Should we enforce that any storage element allocated
for a run of ms-bitfields gets the full alignment of the basetype, even when it
exceeds the size of the basetype and of BIGGEST_ALIGNMENT?
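
A small probe, assuming GCC or a compatible compiler (the aligned attribute and __alignof__ are extensions), shows how the two alignments in question can be inspected. The function names are hypothetical. Only the typedef's alignment is checked; the struct's alignment is reported without a claim, since that is exactly what depends on the target and on ms-bitfield layout.

```c
#include <assert.h>
#include <stddef.h>

/* The two declarations quoted from the struct-layout test. */
typedef _Bool Tal16bool __attribute__((aligned (16)));
struct S49 { Tal16bool a:1; };

/* The typedef itself asks for 16-byte (128-bit) alignment. */
size_t tal16bool_alignment(void)
{
    size_t al = __alignof__(Tal16bool);
    assert(al == 16);
    return al;
}

/* What the enclosing struct actually gets is target and layout dependent,
   so it is only reported here, not checked. */
size_t s49_alignment(void)
{
    return __alignof__(struct S49);
}

size_t s49_size(void)
{
    return sizeof(struct S49);
}
```

Comparing s49_alignment() with and without ms-bitfield layout on a given target would show whether the struct picks up the full 16 bytes.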



Re: Reload Issue -- I can't believe we haven't hit this before

2005-04-19 Thread Jeffrey A Law
On Tue, 2005-04-19 at 21:36 +0200, Eric Botcazou wrote:
> > > For 3.3 and 3.4, this was "fixed" by not recording memory equivalences
> > > that have the infamous RTX_UNCHANGING_P flag set.
> >
> > Also a possibility.  Making the equivalent change (!MEM_READONLY_P)
> > appears to do the trick for mainline.
> 
> Yes, but that's of course not optimal, unnecessary spills are generated.
True.  That's one of the reasons why I mentioned the possibility
of finding these insns and simply removing them.

Think about it for a while -- given a SET where the SET_SRC is a 
pseudo which did not get a hard register and is equivalenced to
a read-only memory location, then the SET must be dead as it
can only be setting the memory location to the value already
in the memory location.

jeff



Re: Heads-up: volatile and C++

2005-04-19 Thread Robert Dewar
Ken Raeburn wrote:
> That's what I thought.  So, unless the compiler (or language spec) is 
> going to become thread-aware, any data to be shared across threads needs 
> to be declared volatile, even if some other mechanism (like a mutex) is 
> in use to do some synchronization.  Which means performance would be 
> poor for any such data.

The use of shared variables without synchronization is rare in any case
in most code.

> Which takes me back to: I think the compiler needs to be thread-aware.  
> "Enhancing" the meaning of volatile, with the attendant performance 
> issues, still doesn't seem adequate to allow for multithreaded 
> programming, unless it's used *everywhere*, and performance shoots 
> through the floor

I don't see the point here, volatile is exactly intended to deal with
this situation



Re: GCC 4.0 RC2 Available

2005-04-19 Thread Joe Buck
On Tue, Apr 19, 2005 at 09:23:17PM +0200, Eric Botcazou wrote:
> > Yes, you sent me a message before when I couldn't build at all, which I
> > applied, but you pointed me to a different patch:
> 
> I was talking about a second message.

I don't recall seeing it, but then I get a lot of mail.  Sorry if I lost
it.

> > If an additional patch is needed, install/specific.html should be updated,
> > and perhaps a single patch that does the whole job should be made available.
> 
> The second patch is not necessary.  It is only meant to avoid the tons of 
> failures you get (it was inadvertently dropped from Binutils 2.15).

But if these failures are important, shouldn't we be recommending the
second patch to users?



Re: Heads-up: volatile and C++

2005-04-19 Thread Paul Koning
> "Robert" == Robert Dewar <[EMAIL PROTECTED]> writes:

 Robert> Ken Raeburn wrote:
 >> That's what I thought.  So, unless the compiler (or language spec)
 >> is going to become thread-aware, any data to be shared across
 >> threads needs to be declared volatile, even if some other
 >> mechanism (like a mutex) is in use to do some synchronization.
 >> Which means performance would be poor for any such data.

 Robert> The use of shared variables without synchronization is rare
 Robert> in any case in most code.

You mean "without explicit synchronization" via mutexes or the like?

It seems that the classic circular buffer communication mechanisms
have been forgotten by many, but they still work quite well and
require no mutexes or the like.  All that is required is sufficiently
ordered memory accesses. 

 >> Which takes me back to: I think the compiler needs to be
 >> thread-aware.  "Enhancing" the meaning of volatile, with the
 >> attendant performance issues, still doesn't seem adequate to allow
 >> for multithreaded programming, unless it's used *everywhere*, and
 >> performance shoots through the floor

 Robert> I don't see the point here, volatile is exactly intended to
 Robert> deal with this situation

At this point I'm sufficiently confused about what the precise current
definition and proposal are that I don't know if the following is
correct today, will be correct under the proposal, neither, or both.
Anyway...

I'm using circular buffers to communicate between threads.  I drop
data into the buffer and move the "in" pointer; the consumer compares
the pointers, reads data from the buffer, and advances the "out"
pointer.  The current code has the pointers declared as volatile, but
the buffer data area is not (and I wouldn't want it to be -- that
seems to be a performance issue).

So this relies on volatile references acting as barriers to the
movement of non-volatile references.  In a coherent memory system (or
single CPU platform) that's sufficient; in a non-coherent or weakly
enough ordered system I would add asm("sync") or the like.

   paul



Re: Heads-up: volatile and C++

2005-04-19 Thread Robert Dewar
Paul Koning wrote:
> "Robert" == Robert Dewar <[EMAIL PROTECTED]> writes:
>
>  Robert> Ken Raeburn wrote:
>  >> That's what I thought.  So, unless the compiler (or language spec)
>  >> is going to become thread-aware, any data to be shared across
>  >> threads needs to be declared volatile, even if some other
>  >> mechanism (like a mutex) is in use to do some synchronization.
>  >> Which means performance would be poor for any such data.
>
>  Robert> The use of shared variables without synchronization is rare
>  Robert> in any case in most code.
>
> You mean "without explicit synchronization" via mutexes or the like?
>
> It seems that the classic circular buffer communication mechanisms
> have been forgotten by many, but they still work quite well and
> require no mutexes or the like.  All that is required is sufficiently
> ordered memory accesses.

Sure, such algorithms are interesting on mono processors, and that's
EXACTLY the situation in which volatile is appropriate (there is a bit
of confusion in C between what Ada would call volatile and atomic. In
Ada, volatile talks about making sure that reads/writes are to memory,
and atomic is about guaranteeing that access takes a single instruction.
For example, in Ada programming the circular buffer works by making the
buffer itself volatile, and the pointers to the buffer atomic. I am not
sure exactly how this works in C, or whether it is really well defined).

Actually shared variables are quite an interesting topic. If you want
to find out much more about them, have a look at Norman Shulman's NYU
PhD thesis (I was the advisor) which is all about shared variables.
Not sure if it is online somewhere or not. There are a number of
interesting algorithms discussed in this thesis, including the class
of algorithms where different threads work on different parts of a
matrix with overlapped shared variables on the edges.

Nevertheless such usage is relatively rare, and volatile is good enough,
I see no basis for any changes to the compiler.

> I'm using circular buffers to communicate between threads.  I drop
> data into the buffer and move the "in" pointer; the consumer compares
> the pointers, reads data from the buffer, and advances the "out"
> pointer.  The current code has the pointers declared as volatile, but
> the buffer data area is not (and I wouldn't want it to be -- that
> seems to be a performance issue).

Of course the buffer data area must be declared as volatile, otherwise
the compiler is free to make and hold private copies of elements in
separate threads. It probably won't in practice, but to rely on this
is non-portable.

> So this relies on volatile references acting as barriers to the
> movement of non-volatile references.

There is no basis for such reliance.

> In a coherent memory system (or
> single CPU platform) that's sufficient; in a non-coherent or weakly
> enough ordered system I would add asm("sync") or the like.

You miss the important point that it is just fine for individual
threads to make local copies of memory variables, e.g. in registers,
and so the reads/writes that you expect to be synchronized in this way
just aren't there at all.

>    paul



Re: Heads-up: volatile and C++

2005-04-19 Thread Paul Koning
> "Robert" == Robert Dewar <[EMAIL PROTECTED]> writes:

 Robert> Paul Koning wrote:
 >>> "Robert" == Robert Dewar <[EMAIL PROTECTED]> writes:
 >>
 Robert> Ken Raeburn wrote:
 >> >> That's what I thought.  So, unless the compiler (or language
 >> >> spec) is going to become thread-aware, any data to be shared
 >> >> across threads needs to be declared volatile, even if some
 >> >> other mechanism (like a mutex) is in use to do some
 >> >> synchronization.  Which means performance would be poor for any
 >> >> such data.
 >> 
 Robert> The use of shared variables without synchronization is rare
 Robert> in any case in most code.
 >> You mean "without explicit synchronization" via mutexes or the
 >> like?
 >> 
 >> It seems that the classic circular buffer communication mechanisms
 >> have been forgotten by many, but they still work quite well and
 >> require no mutexes or the like.  All that is required is
 >> sufficiently ordered memory accesses.

 Robert> Sure such algorithms are interesting on mono processors, ...

They work fine on multi-processors too, of course.  That's where they
first showed up, in CDC 6000 series machines for communication between
CPU and PPU.

 Robert> Nevertheless such usage is relatively rare, and volatile is
 Robert> good enough, I see no basis for any changes to the compiler.

Maybe they are rare because people no longer learn them, even though
they are the most efficient answer in a lot of cases where they end up
using heavier solutions instead.  I remember being amazed that RTLinux
(some years ago) was using interrupt enable/disable to synchronize
control variables in an RTLinux to Linux communication mechanism, when
circular buffers would provide the same service without any need of
interrupt lockout or other explicit synchronization at all.

Also: circular buffers are a VERY common communication mechanism
in DMA devices.  For example, many Ethernet NICs use them.  They may
not be exactly CDC 6000 "CIO" ring buffers, but the basic idea is
similar and the memory access properties needed are analogous.

 >> I'm using circular buffers to communicate between threads.  I drop
 >> data into the buffer and move the "in" pointer; the consumer
 >> compares the pointers, reads data from the buffer, and advances
 >> the "out" pointer.  The current code has the pointers declared as
 >> volatile, but the buffer data area is not (and I wouldn't want it
 >> to be -- that seems to be a performance issue).

 Robert> Of course the buffer data area must be declared as volatile,
 Robert> otherwise the compiler is free to make and hold private
 Robert> copies of elements in separate threads. It probably won't in
 Robert> practice, but to rely on this is non-portable.

Ok.  I think that may be the issue that people are talking about with
the comments about volatile being a costly solution.

I'm using memcpy() to load/unload buffer data.  Is memcpy defined for
volatile variables?  If yes, and I declare the buffer as a char array
(logical, since the threads can pass arbitrary quantities of bytes),
would the generated code be required by the semantics of "volatile" to
do individual loads and stores for every byte?  That is the problem.
What's needed is (a) no code movement across the pointer operations,
(b) the data copy can be fully optimized, as is typically done with
memcpy, (c) no old data hangs around in registers.

I guess GCC lets us do this with asm ("":::"memory") but of course
that isn't portable either...
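
A minimal sketch of the scheme described above, assuming GCC (for the empty asm with a memory clobber), exactly one producer, exactly one consumer, and a coherent memory system. The names ring, ring_put, ring_get and COMPILER_BARRIER are hypothetical. The indices are volatile, the data area is not, and the barrier covers points (a) and (c) while leaving memcpy free to be optimized as in (b). It does nothing about hardware reordering, so a weakly ordered machine would still need a real fence on top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 64u  /* must be a power of two for the index arithmetic */

/* GCC-specific: forbids the compiler from moving memory accesses across
   this point or keeping memory values cached in registers over it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

struct ring {
    volatile size_t in;    /* free-running index, written by producer only */
    volatile size_t out;   /* free-running index, written by consumer only */
    char data[RING_SIZE];  /* deliberately not volatile */
};

/* Producer side: copies up to n bytes in, returns how many were stored. */
size_t ring_put(struct ring *r, const char *src, size_t n)
{
    size_t in, space, pos, first;

    assert(r != NULL && src != NULL);
    in = r->in;
    space = RING_SIZE - (in - r->out);
    if (n > space)
        n = space;
    pos = in % RING_SIZE;
    first = RING_SIZE - pos;          /* bytes before the wrap point */
    if (first > n)
        first = n;
    memcpy(r->data + pos, src, first);
    memcpy(r->data, src + first, n - first);
    COMPILER_BARRIER();               /* data stores stay before the publish */
    r->in = in + n;
    return n;
}

/* Consumer side: copies up to n bytes out, returns how many were read. */
size_t ring_get(struct ring *r, char *dst, size_t n)
{
    size_t out, avail, pos, first;

    assert(r != NULL && dst != NULL);
    out = r->out;
    avail = r->in - out;
    if (n > avail)
        n = avail;
    COMPILER_BARRIER();               /* data loads stay after the index read */
    pos = out % RING_SIZE;
    first = RING_SIZE - pos;
    if (first > n)
        first = n;
    memcpy(dst, r->data + pos, first);
    memcpy(dst + first, r->data, n - first);
    COMPILER_BARRIER();               /* loads finish before the slot is freed */
    r->out = out + n;
    return n;
}
```

Because each index has a single writer, neither side needs a lock; the only requirement is the ordering that the barriers (and, where needed, hardware fences) provide.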

 paul



Re: line-map question

2005-04-19 Thread Per Bothner
Devang Patel wrote:
 From line_map comment at (libcpp/include/line-map.h)
/* Physical source file TO_FILE at line TO_LINE at column 0 is  represented
   by the logical START_LOCATION.  TO_LINE+L at column C is  represented by
   START_LOCATION+(L*(1<
What happens when column number is >= 128 ?
The exact same rule applies.  There is nothing magic about 128, except
it's an initial value that hopefully covers > 90% of the case.  If it
doesn't, the code will increase column_bits, either by allocating a
new line_map, or (in a few cases) tweaking the existing one.  The
latter is what is happening in PR 20907.
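
To illustrate why nothing is special about 128, here is a toy encoder. It assumes that the formula truncated in the quoted comment continues as a left shift by the map's column_bits plus the column number; that continuation, the struct toy_line_map, and the function names are assumptions made for this sketch, not the libcpp definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a line_map entry; field names mirror the comment above
   but this is not the libcpp structure. */
struct toy_line_map {
    uint32_t start_location;
    uint32_t to_line;
    unsigned column_bits;
};

/* Pack (line, column) into one location value. */
uint32_t toy_encode(const struct toy_line_map *map, uint32_t line, uint32_t column)
{
    assert(line >= map->to_line);
    assert(column < (UINT32_C(1) << map->column_bits)); /* column must fit */
    return map->start_location
           + ((line - map->to_line) << map->column_bits)
           + column;
}

/* Recover the line number from a location value. */
uint32_t toy_decode_line(const struct toy_line_map *map, uint32_t loc)
{
    return map->to_line + ((loc - map->start_location) >> map->column_bits);
}

/* Recover the column number from a location value. */
uint32_t toy_decode_column(const struct toy_line_map *map, uint32_t loc)
{
    return (loc - map->start_location) & ((UINT32_C(1) << map->column_bits) - 1);
}
```

With column_bits of 7 a column of 200 does not fit, which is the case where a map with more column bits is needed; with 8 bits the same rule encodes and decodes it without change.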
> This is PR 20907.
I have a patch I'll check in after testing.
--
--Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Eric Botcazou
> I don't recall seeing it, but then I get a lot of mail.  Sorry if I lost
> it.

No problem, I only wanted to check.

> But if these failures are important, shouldn't we be recommending the
> second patch to users?

It's 64-bit STABS and nobody uses 64-bit STABS (as generated by GCC).
As an alternative, I could probably disable STABS for the 64-bit compiler.

-- 
Eric Botcazou


Re: i386 stack slot optimisation

2005-04-19 Thread James E Wilson
Øyvind Harboe wrote:
How does the i386 backend optimise the stack slot assignment to minimize
the displacement offset?
We don't.  We just assign sequential addresses as we allocate stack slots.
; -O0 => large offset
leal    8268(%esp), %eax
incl    (%eax)
; -O3 => small offset
incl    40(%esp)
-O3 enables function inlining.  With function inlining, we can see that 
the arrays are unused and we optimize them away.  Without the arrays, 
the stack frames are small, and hence you get small offsets.

FRAME_GROWS_DOWNWARD has little effect on frame offsets.  The internal 
frame layout has more of an effect.  Almost all targets grow frames in 
the same direction as the stack, and since the stack grows downwards 
here, the frame grows downwards.  That is just the natural direction of 
growth for the frame.

This thread has a stack slot assignment optimisation patch that has
never been committed to GCC CVS, but the above indicates that there is
some sort of mechanism in GCC already to mitigate this problem...
http://gcc.gnu.org/ml/gcc-patches/2003-01/msg00019.html
Yes, this was a real attempt to optimize frame sizes.  The existing 
scheme referred to in that thread is no stack slot assignment 
optimization at all.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: i386 stack slot optimisation

2005-04-19 Thread Øyvind Harboe
> -O3 enables function inlining.  With function inlining, we can see that 
> the arrays are unused and we optimize them away.  Without the arrays, 
> the stack frames are small, and hence you get small offsets.

The external functions in my example using the arrays ensures that the
arrays are not optimized away. 

> FRAME_GROWS_DOWNWARD has little effect on frame offsets.

The registers that are spilled are allocated place in the stack frame
last, so when FRAME_GROWS_DOWNWARD, the displacement offsets are
smaller for spilled registers than with the stack frame growing upwards.

Spilled registers are accessed more frequently than other entries in
the stack frame, so getting this right (which I think pretty much all
backends do, after a quick scan) has a significant positive impact on
code-size.


-- 
Øyvind Harboe
http://www.zylin.com



Unnesting of nested subreg expressions

2005-04-19 Thread Björn Haase
Hi,

when working on replacing avr's present monolithic SI-mode instruction patterns 
by splitters after reload and lowering to QI modes after expand, I have 
stumbled over the following general issue:

The mid-end seems not to be able to simplify nested subreg expressions. I.e. 
it seems that there is no known transformation 

   (subreg:QI (subreg:HI (reg:SI xx) 0) 0) 
-> (subreg:QI (reg:SI xx) 0)

I stumbled over the problem when replacing the avr-target's present 
xorsi3 define_insn by a corresponding define_expand explicitly using 4 
subregs, i.e. after replacing

(define_insn "xorhi3"
  [(set (match_operand:HI 0 "register_operand" "=r")
        (xor:HI (match_operand:HI 1 "register_operand" "%0")
                (match_operand:HI 2 "register_operand" "r")))]
  ""
  "eor %0,%2
eor %B0,%B2"
  [(set_attr "length" "2")
   (set_attr "cc" "set_n")])

by

(define_expand "xorhi3"
  [(set (subreg:QI (match_operand:HI 0 "register_operand" "=r") 0)
        (xor:QI (subreg:QI (match_operand:HI 1 "register_operand" "%0") 0)
                (subreg:QI (match_operand:HI 2 "register_operand" "r") 0)))
   (set (subreg:QI (match_dup 0) 1)
        (xor:QI (subreg:QI (match_dup 1) 1)
                (subreg:QI (match_dup 2) 1)))]
  ""
  "")
 
So far I had seen no regressions in the testsuite; however, after adapting the 
testcase gcc.c-torture/execute/20040629-1.c to compile also on int=16bits 
targets, I am now getting an ICE. The error message reads:

/home/bmh/gnucvs/head/gcc/gcc/testsuite/gcc.c-torture/execute/20040629-1.c:139: 
error: unrecognizable insn:
(insn 28 27 29 0 (set (subreg:QI (reg:HI 59) 0)
(xor:QI (subreg:QI (reg:HI 42) 0)
(subreg:QI (subreg:HI (reg/v:SI 41 [ x ]) 0) 0))) -1 
(insn_list:REG_DEP_TRUE 68 (insn_list:REG_DEP_TRUE 3 (insn_list:REG_DEP_TRUE 
27 (nil
(nil))
/home/bmh/gnucvs/head/gcc/gcc/testsuite/gcc.c-torture/execute/20040629-1.c:139: 
internal compiler error: in extract_insn, at recog.c:2082
Please submit a full bug report,
with preprocessed source if appropriate.

The 20040629-1.c's bitfield operations generate HI mode subregs of SI mode 
registers and these HI mode subregs are themselves passed to the HI->QI mode 
expander.

It seems that the cleanest solution would be to teach gcc how to unnest 
subregs. Therefore my question: Is this possible and where would be the place 
for doing this?
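
The transformation being asked for can be sketched on a toy expression type. This is not GCC's rtx and not its simplification code: the struct toy_rtx, toy_unnest, and the rule of simply adding byte offsets are assumptions for illustration, valid only for narrowing subregs with little-endian style byte offsets as on avr.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_SUBREG };

/* Toy expression node, loosely modelled on (reg:M n) and (subreg:M x off). */
struct toy_rtx {
    enum toy_code code;
    unsigned mode_bytes;          /* QI = 1, HI = 2, SI = 4 */
    unsigned byte_offset;         /* used by TOY_SUBREG only */
    unsigned regno;               /* used by TOY_REG only */
    const struct toy_rtx *inner;  /* used by TOY_SUBREG only */
};

/* Collapse (subreg (subreg ... x)) into a single subreg of the innermost
   operand by accumulating the byte offsets. */
struct toy_rtx toy_unnest(const struct toy_rtx *x)
{
    struct toy_rtx result;

    assert(x != NULL);
    result = *x;
    while (result.code == TOY_SUBREG && result.inner->code == TOY_SUBREG) {
        result.byte_offset += result.inner->byte_offset;
        result.inner = result.inner->inner;
    }
    return result;
}
```

Applied to the operand from the ICE, (subreg:QI (subreg:HI (reg/v:SI 41) 0) 0), this yields (subreg:QI (reg/v:SI 41) 0), which is the form the expander's output would need in order to be recognized.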
 
Yours,

Björn


BTW. I have stumbled over a similar issue when using the gen_highpart and 
gen_lowpart functions for splitters after reload. It sometimes happens that 
one of these functions also gets a subreg expression as input operand while 
not being able to handle it. Both functions seem to fail as well when they 
are working on a label reference immediate operand. It seems that in their 
present form gen_lowpart and gen_highpart should be used only in DI-SI-mode 
splitters since then there is no danger that the DI mode expression itself is 
a subreg of an even larger mode.


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Joe Buck wrote:
On Mon, Apr 18, 2005 at 07:44:03AM -0700, Mark Mitchell wrote:
RC2 is available here:
 ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/

x86_64-unknown-linux-gnu results (for RHEL v3) are at
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01333.html
The failures are almost all related to nested functions; I've always
seen these same failures on that machine.  So it looks good.
Thanks!  I've added that to the Wiki.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Joseph S. Myers wrote:
Results for hppa2.0w-hp-hpux11.11, no regressions:
http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01379.html
Thanks; posted on the Wiki.
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Mark Mitchell
Andrew Haley wrote:
At compile time we don't know the field offset of fields that we
inherit, because it can change at runtime.  So, we don't set the
FIELD_OFFSET, and that is why dbxout is aborting.
OK.  I certainly can't claim that this aspect of the GCC IR is 
particularly well specified.  For example, whether or not derived 
classes should include copies of FIELD_DECLs from base classes is not 
something that I think has been written down anywhere.  We don't do that 
in C++ (instead, we have FIELD_DECLs with the type of the base class), 
but I don't think that would help in your situation, as you still 
wouldn't know what offset these FIELD_DECLs would have.

However, these fields are real, and they are used, but we shouldn't
output any debug info for them.  If I were to remove them from the
list of fields they'd have to be recreated because they may be needed
while compiling classes later in the same compilation unit.
OK.
I set DECL_IGNORED_P on these fields because I don't want debuginfo to
pay any attention to them.  I could, I suppose, set their offset to
zero or even error_mark_node, which seems to work.
I think NULL_TREE is a fine value to represent "unknown" -- the fact 
that it's likely to cause crashes is probably a feature, in that any 
parts of the compiler that go trying to use the field will probably be 
found more quickly.  So, your original patch is fine for mainline.  It's 
also OK for 4.0.1, after 4.0.0 is out.

Thanks,
--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: i386 stack slot optimisation

2005-04-19 Thread James E Wilson
Øyvind Harboe wrote:
The external functions in my example using the arrays ensures that the
arrays are not optimized away.
Ah, right, stupid mistake on my part.
The registers that are spilled are allocated place in the stack frame
last, so when FRAME_GROWS_DOWNWARD, the displacement offsets are
smaller for spilled registers than with the stack frame growing upwards.
I see what I missed the first time.  Without optimization, we have stack 
locals, which get allocated early.  With optimization, we have 
pseudo-regs which get spilled to the stack, which get allocated late. 
So you are right, it is FRAME_GROWS_DOWNWARD that caused the behaviour 
you saw.

However, I would not call this an optimization.  This is just how the 
toolchain accidentally happens to work.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: Interprocedural Dataflow Analysis - Scalability issues

2005-04-19 Thread Daniel Berlin
On Tue, 2005-04-19 at 15:36 +0530, Virender Kashyap wrote:
> Hi,
> I am working on interprocedural data flow analysis(IPDFA) and need some 
> feedback on scalability issues in IPDFA. Firstly since one file is 
> compiled at a time, we can do IPDFA only within a file. 

For starters, we're working on this.

> But that would 
> prevent us from doing analysis for functions which are called in file 
> A, but are defined in some other file B.



You just have to make conservative assumptions, of course.

You almost *never* have the whole program at once, except in
benchmarks :)

> So even if we do any analysis it 
> would give limited advantage.

This isn't necessarily true, it depends on the analysis.

> Moreover, even if we are able to store 
> information on a large number of functions, it would cost heavily in 
> memory, and therefore be non-scalable.

Uh, not necessarily.
It depends on what you are storing.

>  So, to what extent can IPDFA be advantageous ?  Or, are there 
> solutions to above problems ?

Yes, summaries, conservative assumptions about functions we can't see,
etc.
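
As a rough sketch of what "summaries plus conservative assumptions" can look like, the following keeps a tiny per-function summary and falls back to the worst case for any callee it has not seen. The struct fn_summary, its two flags, and the function names in the table are hypothetical; a real analysis would record whatever facts it needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A deliberately small summary: two bits per function. */
struct fn_summary {
    const char *name;
    bool may_read_globals;
    bool may_write_globals;
};

/* Hypothetical summaries computed while compiling the current file. */
static const struct fn_summary known_summaries[] = {
    { "pure_helper", false, false },
    { "read_config", true,  false },
};

/* Look up a callee; anything defined elsewhere gets the worst case. */
struct fn_summary summary_for(const char *callee)
{
    struct fn_summary worst_case;
    size_t i;

    assert(callee != NULL);
    for (i = 0; i < sizeof known_summaries / sizeof known_summaries[0]; i++)
        if (strcmp(known_summaries[i].name, callee) == 0)
            return known_summaries[i];

    worst_case.name = callee;
    worst_case.may_read_globals = true;
    worst_case.may_write_globals = true;
    return worst_case;
}

/* A value held in a global survives the call only if the callee
   provably does not write globals. */
bool global_survives_call(const char *callee)
{
    return !summary_for(callee).may_write_globals;
}
```

The memory cost is a few bits per function rather than the function bodies themselves, and unknown callees cost nothing at all beyond lost precision.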


> 
> Regards,
> 
> Virender.
> 
> 



Re: GCC 4.0 RC2 Available

2005-04-19 Thread Josh Conner
On Apr 18, 2005, at 3:12 PM, Julian Brown wrote:
Results for arm-none-elf, cross-compiled from i686-pc-linux-gnu 
(Debian)
for C and C++ are here:

http://gcc.gnu.org/ml/gcc-testresults/2005-04/msg01301.html
Relative to RC1, there are several new tests which pass, and:
g++.dg/warn/Wdtor1.C (test for excess errors)
works whereas it didn't before.
Julian
Has there been any discussion of builtin-bitops-1.c failing for 
arm-none-elf with RC1 + RC2?  It's a moderately serious regression from 
3.4.3 -- we're no longer correctly calculating the size of load double 
instructions, and so the constant pool is generated at an address that 
is out of range of the instructions that use them.  It will show up in 
moderately large functions that use double word constants.

The problem arises because the function const_double_needs_minipool, 
which decides whether to use a constant pool or do an inline 
calculation of a constant, bases its decision on fairly sophisticated 
logic (from arm_gen_constant), for example figuring that a constant 
like 0xcafecafe can be calculated in 4 instructions:
	mov r0, #0
	mov r1, #51712		; 0xca00
	add	r1, r1, #254		; 0xcafe
	orr	r1, r1, r1, asl #16	; 0xcafecafe

But this code never gets generated, because the instruction pattern 
arm_movdi actually uses a much simpler algorithm (from the function 
output_move_double), which takes 5 instructions for this same example:
	mov	r0, #0
	mvn	r1, #1
	bic	r1, r1, #13568
	bic	r1, r1, #65536
	bic	r1, r1, #889192448

Which means that the "length" attribute for arm_movdi with these 
addressing modes should more accurately be 20 instead of 16.
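
The arithmetic can be checked with a short simulation of the r1 half of both sequences, assuming ARM-mode encoding where every instruction occupies 4 bytes. The function names are hypothetical; the constants are the ones from the two listings above.

```c
#include <assert.h>
#include <stdint.h>

/* r1 half of the 4-instruction sequence derived from arm_gen_constant. */
uint32_t r1_short_sequence(void)
{
    uint32_t r1 = UINT32_C(51712);   /* mov r1, #51712           ; 0xca00     */
    r1 += UINT32_C(254);             /* add r1, r1, #254         ; 0xcafe     */
    r1 |= r1 << 16;                  /* orr r1, r1, r1, asl #16  ; 0xcafecafe */
    return r1;
}

/* r1 half of the 5-instruction sequence emitted by output_move_double. */
uint32_t r1_long_sequence(void)
{
    uint32_t r1 = ~UINT32_C(1);      /* mvn r1, #1               ; 0xfffffffe */
    r1 &= ~UINT32_C(13568);          /* bic r1, r1, #13568       ; 0xffffcafe */
    r1 &= ~UINT32_C(65536);          /* bic r1, r1, #65536       ; 0xfffecafe */
    r1 &= ~UINT32_C(889192448);      /* bic r1, r1, #889192448   ; 0xcafecafe */
    return r1;
}

/* Byte length of a whole move, counting the shared "mov r0, #0" as well. */
unsigned sequence_length_bytes(unsigned r1_insns)
{
    unsigned total_insns = r1_insns + 1u;

    assert(r1_short_sequence() == r1_long_sequence());
    return total_insns * 4u;
}
```

Both sequences produce 0xcafecafe, but three r1 instructions give 16 bytes while four give 20, which is the mismatch between the estimate and the emitted code.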

Richard Earnshaw fixed this behavior in the mainline on 8 April:
	http://gcc.gnu.org/ml/gcc-patches/2005-04/msg00850.html
by using the more efficient constant generator function 
(arm_gen_constant) when generating double-word constants.   However, 
this was not applied to the 4.0 release branch.

Has this regression been discussed already (if so, sorry - I did search 
the archives)?  If not, is it worth considering applying Richard's 
patch, or even a simpler one:

*** arm.md  Wed Feb 16 13:57:10 2005
--- arm.md  Tue Apr 19 17:09:47 2005
***************
*** 4167,4173 ****
    "*
    return (output_move_double (operands));
    "
!   [(set_attr "length" "8,12,16,8,8")
     (set_attr "type" "*,*,*,load2,store2")
     (set_attr "pool_range" "*,*,*,1020,*")
     (set_attr "neg_pool_range" "*,*,*,1008,*")]
--- 4167,4173 ----
    "*
    return (output_move_double (operands));
    "
!   [(set_attr "length" "8,12,20,8,8")
     (set_attr "type" "*,*,*,load2,store2")
     (set_attr "pool_range" "*,*,*,1020,*")
     (set_attr "neg_pool_range" "*,*,*,1008,*")]
Thanks -
Josh

Josh Conner


Java field offsets [was; GCC 4.0 RC2 Available]

2005-04-19 Thread Per Bothner
Andrew Haley wrote:
However, these fields are real, and they are used, but we shouldn't
output any debug info for them.
Does Dwarf support "computed field offsets"?  (This might be needed
for Ada, too.)  If so, the Right Thing might be to emit DIEs so gdb
can calculate the field offsets, mimicking the normal "indirect
dispatch".
Not to say this is worth doing, even if possible. It would probably
be a lot of work to have gdb understand computed offsets, and unless
it is needed for something else, it's not worth it for Java.
That is because we want a solution that also works for dynamically
loaded interpreted classes, and the solution is to get the offsets
from the run-time data structures, rather than the debug information.
There is some partially-bit-rotted code in gdb to extract type
information from run-time Class information, but it was fragile
because it didn't fit well with gdb's obstack-based memory management.
The situation might be different now.
OTOH, just like we now use Dwarf2 unwind-info for exception handling,
perhaps we could use Dwarf debug information for reflection information.
--
--Per Bothner
[EMAIL PROTECTED]   http://per.bothner.com/


Re: GCC 4.0 RC2 Available

2005-04-19 Thread Geoff Keating
On 19/04/2005, at 6:24 AM, Andrew Haley wrote:
Geoffrey Keating writes:
Mark Mitchell <[EMAIL PROTECTED]> writes:
RC2 is available here:
  ftp://gcc.gnu.org/pub/gcc/prerelease-4.0.0-20050417/
As before, I'd very much appreciate it if people would test these 
bits
on primary and secondary platforms, post test results with the
contrib/test_summary script, and send me a message saying whether or
not there are any regressions, together with a pointer to the 
results.
Bad news, I'm afraid.
It's a bug in dbxout.  A field is marked as DECL_IGNORED_P, but
dbxout_type_fields() still tries to access it.
This patch works for me.
Andrew.
2005-04-19  Andrew Haley  <[EMAIL PROTECTED]>
* dbxout.c (dbxout_type_fields): Check DECL_IGNORED_P before
looking at a field's bitpos.





Re: GCC 4.0 RC1 Available

2005-04-19 Thread Kaveh R. Ghazi
 >  > Would you care to take care of that? (I am travelling, and don't have
 >  > much time online.) If so, I'd be very appreciative. 

Sure but... 

 > Done.
 > I'll apply to mainline soon.
 > Paolo

Already done.
Thanks Paolo! :-)

--
Kaveh R. Ghazi  [EMAIL PROTECTED]


Re: 2 suggestions

2005-04-19 Thread Kaveh R. Ghazi
 > On Thu, 14 Apr 2005, Kaveh R. Ghazi wrote:
 > > I guess "x" is fine with me.  However can we use "x" only in the
 > > anchor and not the link's text label?  E.g.:
 > > 
 > >alpha*-*-*
 > > 
 > > That way, the part people actually read in the document still uses
 > > asterisk that they are used to seeing.
 > 
 > Your wish is my command.  Patch proposal below for comments
 > Gerald
 > 
 > 2005-04-14  Gerald Pfeifer  <[EMAIL PROTECTED]>
 > 
 >  * doc/install.texi: Avoid using asterisks in @anchor names.
 >Remove i?86-*-esix from platform directory.
 >Remove powerpc-*-eabiaix from platform directory.

Thanks Gerald, it propagated to the website and works/looks great!

--
Kaveh R. Ghazi  [EMAIL PROTECTED]


Re: Interprocedural Dataflow Analysis - Scalability issues

2005-04-19 Thread Dan Kegel
Daniel Berlin wrote:
I am working on interprocedural data flow analysis (IPDFA) and need some 
feedback on scalability issues in IPDFA. Firstly since one file is 
compiled at a time, we can do IPDFA only within a file. 
For starters, we're working on this.
(I was curious, so I searched a bit.  It looks like
gcc-4.0 supports building parts of itself in this mode?
Though only C and Java stuff right now, not C++.
Related keywords are
--enable-intermodule (see the thread 
http://gcc.gnu.org/ml/gcc-patches/2003-07/msg01146.html)
--enable-libgcj-multifile (see 
http://gcc.gnu.org/ml/java-patches/2003-q3/msg00658.html)
and IMI.  It seems that just listing multiple source files
on the command line is enough to get it to happen?)
But that would 
prevent us from doing analysis for functions which are called in file 
A, but are defined in some other file B. 
You just have to make conservative assumptions, of course.
You almost *never* have the whole program at once, except in
benchmarks :)
True, but hey, if you really need that one server to run
fast, you might actually feed the whole program to the
compiler at once.  Or at least a big part of it.
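
As a small illustration of the "conservative assumptions" point in the quoted text, the sketch below models a call whose body the compiler cannot see. Everything here is hypothetical: `shared_counter`, `call_and_read` and the two callees are made-up names, and a function pointer stands in for "a function defined in some other file B".

```c
/* Global state that an unseen callee might read or write. */
int shared_counter = 0;

/* The callee arrives through a pointer, so its body is unknown at the
   call site, much like a function defined in another file. */
int call_and_read(void (*callee)(void))
{
    shared_counter = 1;
    callee();               /* may modify shared_counter */
    return shared_counter;  /* must be reloaded; cannot be folded to 1 */
}

/* Two possible callees: one touches the global, one does not. */
void bump_counter(void) { shared_counter += 41; }
void do_nothing(void) { }
```

Without seeing the callee, an analysis has to assume the worst case (the global changes). With more of the program visible, it could tell that a callee like `do_nothing` leaves the global alone.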
Moreover, even if we are able to store 
information for a large number of functions, it would cost heavily in 
memory, and would therefore not be scalable.
Uh, not necessarily.
Speaking as a user, it's ok if whole-program optimization takes more memory
than normal compilation.   (Though you may end up needing
a 64 bit processor to use it on anything really big.)
- Dan
--
Trying to get a job as a c++ developer?  See 
http://kegel.com/academy/getting-hired.html


Re: Whirlpool oopses in 2.6.11 and 2.6.12-rc2

2005-04-19 Thread Denis Vlasenko
On Tuesday 19 April 2005 20:40, Chris Wright wrote:
> * Denis Vlasenko ([EMAIL PROTECTED]) wrote:
> > On Tuesday 19 April 2005 08:42, Denis Vlasenko wrote:
> > > modprobe tcrypt hangs the box on both kernels.
> > > The last printks are:
> > > 
> > > 
> > > 
> > > testing wp384
> > > NNUnable to handle kernel paging request at virtual address eXXX
> > > 
> > > Nothing is printed after this and system locks up solid.
> > > No Sysrq-B.
> > > 
> > > IIRC, 2.6.9 was okay.
> > 
> > Update: it does not oops on another machine. CPU or .config related,
> > I'll look into it...
> 
> Any update?  This is candidate for -stable fixing if it's an actual bug.

Yes. wp512_process_buffer() was using 3k of stack if compiled with -O2.
The wp512.c I appended (sans table at top) is instrumented to show it.
Use "make crypto/wp512.s".

This is suboptimal code generation by gcc, so I am CC-ing the
gcc list for comments.

Note that the -Os compiled one (CONFIG_CC_OPTIMIZE_FOR_SIZE=y)
does not have the stack overflow problem and is significantly smaller, too.
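
For readers skimming the instrumented source, the X()/XEND macros switch between two ways of writing the same XOR chain: one long expression, or a sequence of `^=` statements. The sketch below shows only that the two forms compute the same value. The tiny tables `T0`..`T2` and both function names are hypothetical stand-ins, not the real C0..C7 tables, and the sketch makes no attempt to reproduce the stack usage difference.

```c
#include <stdint.h>

/* Hypothetical 4-entry tables standing in for the Whirlpool tables. */
static const uint64_t T0[4] = { 0x01ULL, 0x23ULL, 0x45ULL, 0x67ULL };
static const uint64_t T1[4] = { 0x89ULL, 0xabULL, 0xcdULL, 0xefULL };
static const uint64_t T2[4] = { 0x1000ULL, 0x2000ULL, 0x3000ULL, 0x4000ULL };

/* Form 1: the whole chain as a single expression ("X(a) ^"). */
uint64_t xor_single_expression(uint8_t i0, uint8_t i1, uint8_t i2)
{
    return T0[i0 & 3] ^ T1[i1 & 3] ^ T2[i2 & 3];
}

/* Form 2: one statement per table lookup ("X(a) a ^="). */
uint64_t xor_accumulated(uint8_t i0, uint8_t i1, uint8_t i2)
{
    uint64_t acc = T0[i0 & 3];

    acc ^= T1[i1 & 3];
    acc ^= T2[i2 & 3];
    return acc;
}
```

Since the two forms are semantically identical, any difference in spills or stack usage comes from how the compiler schedules and allocates registers, not from the source.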
--
vda

/**
 * The core Whirlpool transform.
 */

static void wp512_process_buffer(struct wp512_ctx *wctx) {
        int i, r;
        u64 K[8];        /* the round key */
        u64 block[8];    /* mu(buffer) */
        u64 state[8];    /* the cipher state */
        u64 L[8];

        for (i = 0; i < 8; i++) {
                block[i] = be64_to_cpu( ((__be64*)wctx->buffer)[i] );
        }

        state[0] = block[0] ^ (K[0] = wctx->hash[0]);
        state[1] = block[1] ^ (K[1] = wctx->hash[1]);
        state[2] = block[2] ^ (K[2] = wctx->hash[2]);
        state[3] = block[3] ^ (K[3] = wctx->hash[3]);
        state[4] = block[4] ^ (K[4] = wctx->hash[4]);
        state[5] = block[5] ^ (K[5] = wctx->hash[5]);
        state[6] = block[6] ^ (K[6] = wctx->hash[6]);
        state[7] = block[7] ^ (K[7] = wctx->hash[7]);


// gcc optimizer bug: first method is noticeably
// worse than second: loads full u32, shifts and
// zero-extends low u8 to u32
#if 0
 #define BYTE7(v) ((u8)((v) >> 56))
 #define BYTE6(v) ((u8)((v) >> 48))
 #define BYTE5(v) ((u8)((v) >> 40))
 #define BYTE4(v) ((u8)((v) >> 32))
 // gcc optimizer bug: without (u32) below will emit
 // spurious shrd insns
 #define BYTE3(v) ((u8)((u32)(v) >> 24))
 #define BYTE2(v) ((u8)((u32)(v) >> 16))
 #define BYTE1(v) ((u8)((u32)(v) >>  8))
 #define BYTE0(v) ((u8)(v))
#else
// little-endian
 #define BYTE7(v) (((u8*)&v)[7])
 #define BYTE6(v) (((u8*)&v)[6])
 #define BYTE5(v) (((u8*)&v)[5])
 #define BYTE4(v) (((u8*)&v)[4])
 #define BYTE3(v) (((u8*)&v)[3])
 #define BYTE2(v) (((u8*)&v)[2])
 #define BYTE1(v) (((u8*)&v)[1])
 #define BYTE0(v) (((u8*)&v)[0])
#endif

// gcc -O2 optimizer bug: second method
// causes excessive spills (~3K stack used)
#if 1
 #define X(a) a ^=
 #define XEND ;
#else
 #define X(a) ^
 #define XEND
#endif
        for (r = 1; r <= WHIRLPOOL_ROUNDS; r++) {
                asm("#1");
                L[0]  = C0[BYTE7(K[0])] XEND
                X(L[0]) C1[BYTE6(K[7])] XEND
                X(L[0]) C2[BYTE5(K[6])] XEND
                X(L[0]) C3[BYTE4(K[5])] XEND
                X(L[0]) C4[BYTE3(K[4])] XEND
                X(L[0]) C5[BYTE2(K[3])] XEND
                X(L[0]) C6[BYTE1(K[2])] XEND
                X(L[0]) C7[BYTE0(K[1])] XEND
                X(L[0]) rc[r];
                asm("#2");

                L[1]  = C0[BYTE7(K[1])] XEND
                X(L[1]) C1[BYTE6(K[0])] XEND
                X(L[1]) C2[BYTE5(K[7])] XEND
                X(L[1]) C3[BYTE4(K[6])] XEND
                X(L[1]) C4[BYTE3(K[5])] XEND
                X(L[1]) C5[BYTE2(K[4])] XEND
                X(L[1]) C6[BYTE1(K[3])] XEND
                X(L[1]) C7[BYTE0(K[2])];

                L[2]  = C0[BYTE7(K[2])] XEND
                X(L[2]) C1[BYTE6(K[1])] XEND
                X(L[2]) C2[BYTE5(K[0])] XEND
                X(L[2]) C3[BYTE4(K[7])] XEND
                X(L[2]) C4[BYTE3(K[6])] XEND
                X(L[2]) C5[BYTE2(K[5])] XEND
                X(L[2]) C6[BYTE1(K[4])] XEND
                X(L[2]) C7[BYTE0(K[3])];

                L[3]  = C0[BYTE7(K[3])] XEND
                X(L[3]) C1[BYTE6(K[2])] XEND
                X(L[3]) C2[BYTE5(K[1])] XEND
                X(L[3]) C3[BYTE4(K[0])] XEND
                X(L[3]) C4[BYTE3(K[7])] XEND
                X(L[3]) C5[BYTE2(K[6])] XEND
                X(L[3]) C6[BYTE1(K[5])] XEND
                X(L[3]) C7[BYTE0(K[4])];

                L[4]  = C0[BYTE7(K[4])] XEND
                X(L[4]) C1[BYTE6(K[3])] XEND
                X(L[4]) C2[BYTE5(K[2])] XEND
                X(L[4]) C3[BYTE4(K[1])] XEND
                X(L[4]) C4[BYTE3(K[0])] XEND
                X(L[4]) C5[BYTE2(K[7])] XEND
                X(L[4]) C6[BYTE1(K[6])] XEND
                X(L[4]) C7[BYTE0(K[5])];

                L[5]  = C0[BYTE7(K[5])] XEND
                X(L[5]) C1[BYTE6(K[4])] XEND
                X(L[5]) C2[BYTE5(K[3])] XEND
                X(L[5]) C3[BYTE4(K[2])] XEND
                X(L[5]) C4[BYTE3(K[1])] XEND
                X(L[5]) C5[BYTE2(K[0])] XE

Re: i386 stack slot optimisation

2005-04-19 Thread Øyvind Harboe
> I see what I missed the first time.  Without optimization, we have stack 
> locals, which get allocated early.  With optimization, we have 
> pseudo-regs which get spilled to the stack, which get allocated late. 
> So you are right, it is FRAME_GROWS_DOWNWARD that caused the behaviour 
> you saw.
> 
> However, I would not call this an optimization.  This is just how the 
> toolchain accidentally happens to work.

Without this emergent behaviour, stack slot assignment optimisation
would have been much more important.


-- 
Øyvind Harboe
http://www.zylin.com