Re: #pragma support to guide autovectorizer

2005-08-02 Thread Dorit Naishlos




> I was wondering if any additional work had been completed toward pragma
> support for the autovectorization branch (see
> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01560.html)?
>

I think Devang was planning to continue this work - I'm not sure where it
stands

dorit

>  Thanks..
>
>   Chad Rosier
>



Re: does the instruction combiner regards (foo & 0xff) as a special case?

2005-08-02 Thread ibanez




You are cool, now I found a

(set (reg:CC_Z 33 cc)
(compare:CC_Z (zero_extend:SI (subreg:QI (reg/v:SI 166 [ a ]) 0))
(const_int 0 [0x0])))

It's what I'm looking for.

Thank you so much.
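
For readers following along, a minimal C sketch (not from the thread) of the source-level equivalence behind that RTL: masking with 0xff and comparing against zero tests the same bits as truncating to the low byte and zero-extending it again. The function names are made up for illustration, and an 8-bit unsigned char is assumed.

```c
#include <assert.h>

/* Hypothetical helper: the test as written in source, with an explicit mask. */
int low_byte_is_zero_mask(unsigned int a)
{
    return (a & 0xff) == 0;
}

/* Hypothetical helper: the same test on the truncated low byte, which the
   integer promotions zero-extend again, loosely mirroring
   (zero_extend:SI (subreg:QI ...)).  Assumes CHAR_BIT == 8. */
int low_byte_is_zero_trunc(unsigned int a)
{
    return (unsigned char)a == 0;
}

/* Returns the shared result after checking that both forms agree. */
int low_byte_is_zero(unsigned int a)
{
    assert(low_byte_is_zero_mask(a) == low_byte_is_zero_trunc(a));
    return low_byte_is_zero_mask(a);
}
```

Whether the combiner actually picks the zero_extend form for a given target depends on the machine description, so this only shows why the two forms are interchangeable.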



Re: Large, modular C++ application performance ...

2005-08-02 Thread michael meeks

On Mon, 2005-08-01 at 14:18 +0200, Steven Bosscher wrote:
> On Monday 01 August 2005 11:44, michael meeks wrote:
> > However - the log(s) term is rather irrelevant to my argument :-)
> 
> Not really.  Maybe the oprofile results for the linker show that the
> behavior is worse, or maybe better - who knows :-)
> Have you looked at any profiles btw?  Just for the curious...

Yes - identifying the linker and relocation processing as the root
cause of the problem isn't just a stab in the dark :-)

This flags up as the no. 1 (individual) performance killer with whatever
profiling tools you use, e.g.:

* vtune
* speedprof
* instrumenting top/tail of dlopen calls

etc. :-)

Regards,

Michael.

-- 
 [EMAIL PROTECTED]  <><, Pseudo Engineer, itinerant idiot



Re: Large, modular C++ application performance ...

2005-08-02 Thread michael meeks
Hi H.J.,

On Mon, 2005-08-01 at 08:55 -0700, H. J. Lu wrote:
> > -fvisibility is helpful - as the paper says, not as helpful as the old
> > -Bsymbolic (or link maps exposing only 3 or so functions) were. However
> > - -fvisibility can only help so much - if you have:
>
> Since you were comparing Windows vs. ELF, doesn't Windows need a file
> to define which symbols to export for a shared library ?

Apparently so - here is my (fragmentary) understanding of that -
Martin - please do correct me. OO.o builds the .defs on Win32 with a
custom tool called 'ldump4'. That (interestingly) goes groping in some
binary file format, reads the symbol table, groks symbols tagged with
'EXPORT:', and builds a .def file. i.e. it *looks* like it's automated,
and can use the API marked (__dllexport etc.) where appropriate.

> Why can't you do it with ELF using a linker map? Libstdc++.so is
> built with a linker map. Any C++ shared library should use one if the
> startup time is a big concern. Of course, if gcc can generate a list
> of symbols suitable for linker map, which needs to be exported, it will
> be very helpful. I don't think it will be too hard to implement.

So - the thing about linker maps (cf. the ldump4 tool) is that they
tend to be hard to maintain, not portable across platforms, a source of
grief and problems etc. ;-) [ we have several strata of old, now defunct
link maps lying around from previous investments of effort that
subsequently became useless ].

As I recall, I saw a suggestion (from you I think), for a new
visibility attribute 'export' or somesuch, that would resolve names
internally to the library, while still exporting the symbols.

That would suit our needs beautifully - if, when used to annotate a
class, it would allow the various typeinfo / vague-linkage pieces
through as 'default'. Is it a realistic suggestion ? / if so, am happy
to knock up a patch.

[ and of course, this is only 1/2 the problem - the other half isn't
much helped by visibility markup as previously discussed ;-]
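
To make the trade-off concrete, here is a minimal sketch of the existing GCC visibility markup being discussed. The function names and the build command are invented for illustration; the proposed 'export' visibility is not shown because it does not exist. ELF's existing 'protected' visibility has broadly the "exported but bound internally" behaviour, though whether it would suit the typeinfo / vague-linkage case is not something this sketch answers.

```c
/* Sketch only.  Intended to be built as a shared object, e.g.
   gcc -shared -fPIC example.c -o libexample.so
   (file and library names are placeholders). */

/* Hidden: absent from the dynamic symbol table, so calls from inside the
   library bind locally and need no named relocation. */
__attribute__((visibility("hidden")))
int helper_twice(int x)
{
    return 2 * x;
}

/* Protected: exported to other modules, but references from inside the
   defining library bind to this definition. */
__attribute__((visibility("protected")))
int api_twice_bound_locally(int x)
{
    return helper_twice(x);
}

/* Default: exported and, on ELF, interposable, so internal calls to it
   may still go through the PLT unless the link uses something like
   -Bsymbolic or a linker map. */
__attribute__((visibility("default")))
int api_twice_plus_one(int x)
{
    return api_twice_bound_locally(x) + 1;
}
```

None of this touches relocations against symbols that live in other DSOs.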

Thanks,

Michael.

-- 
 [EMAIL PROTECTED]  <><, Pseudo Engineer, itinerant idiot



Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Diego Novillo
On Mon, Aug 01, 2005 at 10:12:37PM -0700, Ian Lance Taylor wrote:
> Harald van Dijk <[EMAIL PROTECTED]> writes:
> 
> > I finally managed to track down the problem I've been having to this
> > short code:
> > 
> > typedef struct {
> >   unsigned car;
> >   unsigned cdr;
> > } cons;
> > 
> > void nconc (unsigned x, unsigned y) {
> >   unsigned *ptr = &x;
> >   while(!(*ptr & 3))
> >     ptr = &((cons *)(*ptr))->cdr;
> >   *ptr = y;
> > }
> > 
> > With gcc 4.0-20050728 on i686-pc-linux-gnu, compiling this with -O2
> > appears to remove the assignment to *ptr. (I didn't prepare an example
> > program, but it's verifiable with objdump.) Obviously, this code is
> > non-portable, but still, I don't see why this can happen. Would anyone
> > be kind enough to explain this to me? It works as expected with -O2
> > -fno-strict-aliasing.
> 
> Well, I'd say it's a bug.  It works in 4.1.  The final assignment gets
> removed by tree-ssa-dce.c because it looks like a useless store.  This
> is because alias analysis thinks it knows what is going on, when it
> clearly does not.
> 
Are you sure?  I am not a language lawyer, but my understanding
is that you cannot legally make pointer 'p' point outside of
'x' using pointer arithmetic.  Since 'x' is a PARM_DECL passed by
value, the last assignment is a dead store.

In this case, 'ptr' should be marked as pointing anywhere.
However, alias analysis could also conclude that 'ptr' may not
point outside the current local frame.  So, the last store would
still be marked dead.

This distinction of different meanings for "points anywhere" will
be a feature of 4.2, most likely.

Having said that, I sent rth a 4.0 patch for a similar bug that
will "fix" this problem.  Richard, have you applied it yet?



* tree-ssa-alias.c (add_pointed_to_var): If VALUE is of
the form &(*PTR), take points-to information from PTR.

Index: tree-ssa-alias.c
===
RCS file: /cvs/gcc/gcc/gcc/tree-ssa-alias.c,v
retrieving revision 2.71.2.1
diff -d -u -p -r2.71.2.1 tree-ssa-alias.c
--- tree-ssa-alias.c    26 Feb 2005 16:24:27 -  2.71.2.1
+++ tree-ssa-alias.c    21 Jul 2005 20:13:44 -
@@ -1904,7 +1904,11 @@ add_pointed_to_var (struct alias_info *a
   if (REFERENCE_CLASS_P (pt_var))
     pt_var = get_base_address (pt_var);
 
-  if (pt_var && SSA_VAR_P (pt_var))
+  if (pt_var == NULL)
+    {
+      pi->pt_anything = 1;
+    }
+  else if (SSA_VAR_P (pt_var))
     {
       uid = var_ann (pt_var)->uid;
       bitmap_set_bit (ai->addresses_needed, uid);
@@ -1918,6 +1922,18 @@ add_pointed_to_var (struct alias_info *a
       if (is_global_var (pt_var))
         pi->pt_global_mem = 1;
     }
+  else if (TREE_CODE (pt_var) == INDIRECT_REF
+           && TREE_CODE (TREE_OPERAND (pt_var, 0)) == SSA_NAME)
+    {
+      /* If VALUE is of the form &(*P_j), then PTR will have the same
+         points-to information as P_j.  */
+      add_pointed_to_expr (ai, ptr, TREE_OPERAND (pt_var, 0));
+    }
+  else
+    {
+      /* Give up.  PTR points anywhere.  */
+      set_pt_anything (ptr);
+    }
 }


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Richard Guenther
On 8/2/05, Diego Novillo <[EMAIL PROTECTED]> wrote:
> On Mon, Aug 01, 2005 at 10:12:37PM -0700, Ian Lance Taylor wrote:
> > Harald van Dijk <[EMAIL PROTECTED]> writes:
> >
> > > I finally managed to track down the problem I've been having to this
> > > short code:
> > >
> > > typedef struct {
> > >   unsigned car;
> > >   unsigned cdr;
> > > } cons;
> > >
> > > void nconc (unsigned x, unsigned y) {
> > >   unsigned *ptr = &x;
> > >   while(!(*ptr & 3))
> > >     ptr = &((cons *)(*ptr))->cdr;
> > >   *ptr = y;
> > > }
> > >
> > > With gcc 4.0-20050728 on i686-pc-linux-gnu, compiling this with -O2
> > > appears to remove the assignment to *ptr. (I didn't prepare an example
> > > program, but it's verifiable with objdump.) Obviously, this code is
> > > non-portable, but still, I don't see why this can happen. Would anyone
> > > be kind enough to explain this to me? It works as expected with -O2
> > > -fno-strict-aliasing.
> >
> > Well, I'd say it's a bug.  It works in 4.1.  The final assignment gets
> > removed by tree-ssa-dce.c because it looks like a useless store.  This
> > is because alias analysis thinks it knows what is going on, when it
> > clearly does not.
> >
> Are you sure?  I am not a language lawyer, but my understanding
> is that you cannot legally make pointer 'p' point outside of
> 'x' using pointer arithmetic.  Since 'x' is a PARM_DECL passed by
> value, the last assignment is a dead store.

p is not made to point 'outside' of x, but x is treated as a pointer, cast
to a struct pointer and then dereferenced.  Only if the loop entry condition
is false we end up storing into x (but only to x, not to memory beyond x),
and this store is of course dead.

Richard.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Richard Guenther
On 8/2/05, Richard Guenther <[EMAIL PROTECTED]> wrote:
> On 8/2/05, Diego Novillo <[EMAIL PROTECTED]> wrote:
> > On Mon, Aug 01, 2005 at 10:12:37PM -0700, Ian Lance Taylor wrote:
> > > Harald van Dijk <[EMAIL PROTECTED]> writes:
> > >
> > > > I finally managed to track down the problem I've been having to this
> > > > short code:
> > > >
> > > > typedef struct {
> > > >   unsigned car;
> > > >   unsigned cdr;
> > > > } cons;
> > > >
> > > > void nconc (unsigned x, unsigned y) {
> > > >   unsigned *ptr = &x;
> > > >   while(!(*ptr & 3))
> > > >     ptr = &((cons *)(*ptr))->cdr;
> > > >   *ptr = y;
> > > > }
> > > >
> > > > With gcc 4.0-20050728 on i686-pc-linux-gnu, compiling this with -O2
> > > > appears to remove the assignment to *ptr. (I didn't prepare an example
> > > > program, but it's verifiable with objdump.) Obviously, this code is
> > > > non-portable, but still, I don't see why this can happen. Would anyone
> > > > be kind enough to explain this to me? It works as expected with -O2
> > > > -fno-strict-aliasing.
> > >
> > > Well, I'd say it's a bug.  It works in 4.1.  The final assignment gets
> > > removed by tree-ssa-dce.c because it looks like a useless store.  This
> > > is because alias analysis thinks it knows what is going on, when it
> > > clearly does not.
> > >
> > Are you sure?  I am not a language lawyer, but my understanding
> > is that you cannot legally make pointer 'p' point outside of
> > 'x' using pointer arithmetic.  Since 'x' is a PARM_DECL passed by
> > value, the last assignment is a dead store.
> 
> p is not made to point 'outside' of x, but x is treated as a pointer, cast
> to a struct pointer and then dereferenced.  Only if the loop entry condition
> is false we end up storing into x (but only to x, not to memory beyond x),
> and this store is of course dead.

Oh, and a workaround and slight correction would be to write

void nconc (unsigned x, unsigned y) {
  unsigned *ptr = &((cons *)x)->cdr;
  while(!(*ptr & 3))
    ptr = &((cons *)(*ptr))->cdr;
  *ptr = y;
}

which lets alias analysis see that the store is not dead; in fact, the store
will never be to the argument area.

Richard.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Diego Novillo
On Tue, Aug 02, 2005 at 02:56:50PM +0200, Richard Guenther wrote:
> Oh, and a workaround and slight correction would be to write
> 
> void nconc (unsigned x, unsigned y) {
>  unsigned *ptr = &((cons *)x)->cdr;
>  while(!(*ptr & 3))
>  ptr = &((cons *)(*ptr))->cdr;
>  *ptr = y;
>  }
> 
No.  Same problem.  The aliaser would say "yes, ptr points
anywhere, but it cannot escape the local frame".  The final store
is dead just the same.

We only "get it right" because we do not distinguish between
different degrees of points-anywhere.


bug in gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)

2005-08-02 Thread Mark Frazer
Hello.  I'm not on the list, so please CC me with any replies.

I have come across a bug found during some code which serializes
doubles.  The bug is only encountered when the optimization level is set
to -O2 or greater.

The bug is not encountered when compiled under
gcc (GCC) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)
at any optimization level.

The de-serialization code is:

typedef unsigned char uint8;
static uint8 next_byte(uint &offset, std::vector<uint8> const &bytecode)
   throw (std::invalid_argument)
{
   if (offset >= bytecode.size())
      throw (std::invalid_argument("Unexpected end of bytecode"));
   return bytecode[offset++];
}

double parse_double(uint &offset, std::vector<uint8> const &bytecode)
   throw (std::invalid_argument)
{
   typedef unsigned long long uint64;
   uint64 rtn = uint64(next_byte(offset, bytecode)) << 56;
   rtn |= uint64(next_byte(offset, bytecode)) << 48;
   rtn |= uint64(next_byte(offset, bytecode)) << 40;
   rtn |= uint64(next_byte(offset, bytecode)) << 32;
   rtn |= uint64(next_byte(offset, bytecode)) << 24;
   rtn |= uint64(next_byte(offset, bytecode)) << 16;
   rtn |= uint64(next_byte(offset, bytecode)) << 8;
   rtn |= uint64(next_byte(offset, bytecode));
   return *reinterpret_cast<double *>(&rtn);
}

Full source code to a demonstration of the bug, and a Makefile is at
http://mjfrazer.org/~mjfrazer/tmp/pack-test/
The tar file in the directory contains all the other files, so you just
need to grab that.

cheers
-mark


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Daniel Jacobowitz
On Tue, Aug 02, 2005 at 09:08:51AM -0400, Diego Novillo wrote:
> On Tue, Aug 02, 2005 at 02:56:50PM +0200, Richard Guenther wrote:
> > Oh, and a workaround and slight correction would be to write
> > 
> > void nconc (unsigned x, unsigned y) {
> >  unsigned *ptr = &((cons *)x)->cdr;
> >  while(!(*ptr & 3))
> >  ptr = &((cons *)(*ptr))->cdr;
> >  *ptr = y;
> >  }
> > 
> No.  Same problem.  The aliaser would say "yes, ptr points
> anywhere, but it cannot escape the local frame".  The final store
> is dead just the same.
> 
> We only "get it right" because we do not distinguish between
> different degrees of points-anywhere.

Then the alias analyzer's broken.  This isn't pointer arithmetic in the
sense that you mean.  It would be if the line were:

 ptr = &((cons *)(ptr))->cdr;

which is equivalent to some offset plus ptr.  But there's an extra
dereference:

 ptr = &((cons *)(*ptr))->cdr;
                  ^

As far as I can tell, this code doesn't actually violate any of the
aliasing rules.  It just looks funny.
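
A runnable sketch of that reading, under some assumptions that are not in the thread: uintptr_t replaces unsigned so the integer/pointer conversions also work on 64-bit hosts, cells are assumed to be at least 4-byte aligned so the low two bits of a cell address are zero, and the zero check is an added safety net. Each loop step loads a stored value and uses it as an address; it never offsets ptr itself.

```c
#include <assert.h>
#include <stdint.h>

/* A cell whose fields hold either a tagged immediate (low two bits
   non-zero) or the address of another cell (low two bits zero). */
typedef struct {
    uintptr_t car;
    uintptr_t cdr;
} cons;

/* Same shape as the nconc in the thread: follow cdr links until a tagged
   value is found, then overwrite that slot with y. */
void nconc(uintptr_t list, uintptr_t y)
{
    uintptr_t *ptr = &list;
    while (!(*ptr & 3)) {
        assert(*ptr != 0);              /* a zero word would be a null cell */
        ptr = &((cons *)(*ptr))->cdr;   /* load a value, use it as an address */
    }
    *ptr = y;
}
```

If list is itself tagged, the loop body never runs and the store lands in the by-value parameter, which is the only case where it is dead. Otherwise the store lands in the cdr field of the last cell, which the caller can observe.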

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: bug in gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)

2005-08-02 Thread Richard Guenther
On 8/2/05, Mark Frazer <[EMAIL PROTECTED]> wrote:
> Hello.  I'm not on the list, so please CC me with any replies.
> 
> I have come across a bug found during some code which serializes
> doubles.  The bug is only encountered when the optimization level is set
> to -O2 or greater.
> 
> The bug is not encountered when compiled under
> gcc (GCC) 3.3.3 20040412 (Red Hat Linux 3.3.3-7)
> at any optimization level.
> 
> The de-serialization code is:
> 
> typedef unsigned char uint8;
> static uint8 next_byte(uint &offset, std::vector<uint8> const &bytecode)
>    throw (std::invalid_argument)
> {
>    if (offset >= bytecode.size())
>       throw (std::invalid_argument("Unexpected end of bytecode"));
>    return bytecode[offset++];
> }
> 
> double parse_double(uint &offset, std::vector<uint8> const &bytecode)
>    throw (std::invalid_argument)
> {
>    typedef unsigned long long uint64;
>    uint64 rtn = uint64(next_byte(offset, bytecode)) << 56;
>    rtn |= uint64(next_byte(offset, bytecode)) << 48;
>    rtn |= uint64(next_byte(offset, bytecode)) << 40;
>    rtn |= uint64(next_byte(offset, bytecode)) << 32;
>    rtn |= uint64(next_byte(offset, bytecode)) << 24;
>    rtn |= uint64(next_byte(offset, bytecode)) << 16;
>    rtn |= uint64(next_byte(offset, bytecode)) << 8;
>    rtn |= uint64(next_byte(offset, bytecode));
>    return *reinterpret_cast<double *>(&rtn);
> }
> 
> Full source code to a demonstration of the bug, and a Makefile is at
> http://mjfrazer.org/~mjfrazer/tmp/pack-test/
> The tar file in the directory contains all the other files, so you just
> need to grab that.

Try -fno-strict-aliasing.  This may be related to PR23192.

Richard.


Re: bug in gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)

2005-08-02 Thread Mark Frazer
Richard Guenther <[EMAIL PROTECTED]> [05/08/02 09:29]:
> Try -fno-strict-aliasing.  This may be related to PR23192.

-fno-strict-aliasing does indeed make the problem go away.

thanks!
-mark

-- 
Forget your stupid theme park! I'm gonna make my own! With hookers! And
blackjack! In fact, forget the theme park! - Bender


Re: bug in gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)

2005-08-02 Thread Mark Frazer
Mark Frazer <[EMAIL PROTECTED]> [05/08/02 09:18]:
> Hello.  I'm not on the list, so please CC me with any replies.
> 
> I have come across a bug found during some code which serializes
> doubles.  The bug is only encountered when the optimization level is set
> to -O2 or greater.

Oh, I forgot to mention that I'm running Fedora Core 4 on ia32.

[EMAIL PROTECTED] pack-test]$ gcc --version
gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

[EMAIL PROTECTED] pack-test]$ uname -a
Linux pacific.mjfrazer.org 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 athlon i386 GNU/Linux

-mark
-- 
Forget your stupid theme park! I'm gonna make my own! With hookers! And
blackjack! In fact, forget the theme park! - Bender


Re: Large, modular C++ application performance ...

2005-08-02 Thread H. J. Lu
On Tue, Aug 02, 2005 at 10:59:01AM +0100, michael meeks wrote:
> Hi H.J.,
> 
> > Why can't you do it with ELF using a linker map? Libstdc++.so is
> > built with a linker map. Any C++ shared library should use one if the
> > startup time is a big concern. Of course, if gcc can generate a list
> > of symbols suitable for linker map, which needs to be exported, it will
> > be very helpful. I don't think it will be too hard to implement.
> 
>   So - the thing about linker maps (cf. the ldump4 tool) is that they
> tend to be hard to maintain, not portable across platforms, a source of
> grief and problems etc. ;-) [ we have several strata of old, now defunct
> link maps lying around from previous investments of effort that
> subsequently became useless ].

Maintaining a C++ linker map isn't easy. I think gcc should help out
here.

> 
>   As I recall, I saw a suggestion (from you I think), for a new
> visibility attribute 'export' or somesuch, that would resolve names
> internally to the library, while still exporting the symbols.

I suggested the "export" visibility to export a symbol from an
executable, even if it wasn't used by any DSOs.

> 
>   That would suit our needs beautifully - if, when used to annotate a
> class, it would allow the various typeinfo / vague-linkage pieces
> through as 'default'. Is it a realistic suggestion ? / if so, am happy
> to knock up a patch.
> 
>   [ and of course, this is only 1/2 the problem - the other half isn't
> much helped by visibility markup as previously discussed ;-]
> 

Why not? If you know a symbol in DSO won't be overridden by others,
you can resolve it locally via a linker map.



H.J.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Diego Novillo
On Tue, Aug 02, 2005 at 09:39:56AM -0400, Daniel Jacobowitz wrote:

> Then the alias analyzer's broken.
>
Broken?  I'm saying that we currently get this right.  I don't
know what position you are arguing.

> This isn't pointer arithmetic in the sense that you mean.  It
> would be if the line were:
> 
>  ptr = &((cons *)(ptr))->cdr;
> 
Yes, I realize this now.  And that is not my point.  

> which is equivalent to some offset plus ptr.  But there's an extra
> dereference:
> 
>  ptr = &((cons *)(*ptr))->cdr;
>                   ^
> 
This code does build an address location out of an arbitrary integer:

  unsigned int D.1142_8 = *ptr_1;
  struct cons *D.1143_9 = (struct cons *) D.1142_8;
  ptr_10 = &D.1143_9->cdr;

Does the language allow the creation of address locations out of
arbitrary integer values?  Is the dereference of such an
address a defined operation?  If so, then it's simply a matter of
recognizing this situation when computing points-anywhere
attributes.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Daniel Jacobowitz
On Tue, Aug 02, 2005 at 09:57:39AM -0400, Diego Novillo wrote:
> On Tue, Aug 02, 2005 at 09:39:56AM -0400, Daniel Jacobowitz wrote:
> 
> > Then the alias analyzer's broken.
> >
> Broken?  I'm saying that we currently get this right.  I don't
> know what position you are arguing.

Sorry, my mistake.  I'd forgotten that Ian said we got this right in
4.1.

> This code does build an address location out of an arbitrary integer:
> 
>   unsigned int D.1142_8 = *ptr_1;
>   struct cons *D.1143_9 = (struct cons *) D.1142_8;
>   ptr_10 = &D.1143_9->cdr;
> 
> Does the language allow the creation of address locations out of
> arbitrary integer values?  Is the dereference of such an
> address a defined operation?  If so, then it's simply a matter of
> recognizing this situation when computing points-anywhere
> attributes.

Yes, it does - well, it's implementation defined, but GCC has long
chosen the natural interpretation.  C99 6.3.2.3, paragraph 5.  This is
no different from that classic example, a pointer which escapes via
printf/scanf.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Diego Novillo
On Tue, Aug 02, 2005 at 10:05:37AM -0400, Daniel Jacobowitz wrote:

> Yes, it does - well, it's implementation defined, but GCC has long
> chosen the natural interpretation.  C99 6.3.2.3, paragraph 5.  This is
> no different from that classic example, a pointer which escapes via
> printf/scanf.
> 
OK, thanks.  That settles it then.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Andreas Schwab
Diego Novillo <[EMAIL PROTECTED]> writes:

> Does the language allow the creation of address locations out of
> arbitrary integer values?

Yes.

6.3.2.3 Pointers

5 An integer may be converted to any pointer type. [...]

>  Is the dereference of such an address a defined operation?

It is implementation-defined.

[...] Except as previously specified, the result is
implementation-defined, might not be correctly aligned, might not
point to an entity of the referenced type, and might be a trap
representation.

Also, the integer may have been the result of casting a valid pointer, in
which case the operation is fully defined (assuming the integer is wide
enough).
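
A small sketch of that last case, assuming the implementation provides the optional uintptr_t type, which C99 specifies as wide enough for a pointer to void to survive the round trip. The function name is invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trips an object pointer through an integer and stores through
   the recovered pointer. */
void store_via_integer(int *target, int value)
{
    uintptr_t bits = (uintptr_t)(void *)target;   /* pointer -> integer */
    int *back = (int *)(void *)bits;              /* integer -> pointer */
    assert(back == target);   /* follows from the uintptr_t round-trip guarantee */
    *back = value;
}
```

The store is visible to the caller, so a compiler that treated a pointer rebuilt from an integer as unable to leave the local frame would be wrong to delete it.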

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: bug in gcc (GCC) 4.0.1 20050727 (Red Hat 4.0.1-5)

2005-08-02 Thread Mark Frazer
Mark Frazer <[EMAIL PROTECTED]> [05/08/02 09:32]:
> Richard Guenther <[EMAIL PROTECTED]> [05/08/02 09:29]:
> > Try -fno-strict-aliasing.  This may be related to PR23192.
> 
> -fno-strict-aliasing does indeed make the problem go away.

changing the de-serialization function to:

  double parse_double(uint &offset, vector<uint8> const &bytecode)
 throw (std::invalid_argument)
  {
 union {
uint64 ival;
double dval;
 } rtn;
 rtn.ival  = uint64(next_byte(offset, bytecode)) << 56;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 48;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 40;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 32;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 24;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 16;
 rtn.ival |= uint64(next_byte(offset, bytecode)) << 8;
 rtn.ival |= uint64(next_byte(offset, bytecode));
 return rtn.dval;
  }

Allows for the strict-aliasing optimization to be left in.  So, it seems
the bug was mine, not gcc's.

I'm off to search for other reinterpret_cast abuses in my code...
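
For reference, another commonly used way to do the same reinterpretation is to copy the object representation with memcpy, which does not access an object through an lvalue of a different type. This is a sketch in plain C rather than the poster's code: the function name and the raw byte-pointer interface are invented, and it assumes an 8-byte IEEE 754 double with the same byte order as 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assembles a 64-bit pattern from 8 big-endian bytes, then reinterprets
   it as a double by copying the bytes rather than casting a pointer. */
double parse_double_be(const unsigned char *bytes)
{
    uint64_t bits = 0;
    for (int i = 0; i < 8; i++)
        bits = (bits << 8) | bytes[i];

    double value;
    assert(sizeof value == sizeof bits);
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain register move, so it need not cost more than the cast it replaces.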

cheers
-mark
-- 
To Captain Bender! He's the best! ...at being a big jerk who's stupid and
his big ugly face is as dumb as a butt! - Fry


Re: Large, modular C++ application performance ...

2005-08-02 Thread michael meeks

On Tue, 2005-08-02 at 06:57 -0700, H. J. Lu wrote:
> Maintaining a C++ linker map isn't easy. I think gcc should help out
> here.

What do you suggest ? - something separate from the visibility markup ?
perhaps what I'm suggesting is some horrible misuse of that. Clearly
adding a new visibility attribute that would bind that symbol
internally, yet export it would be a simple approach; did you have a
better idea ? and/or suggestions for a name ? - or is this a total
non-starter for some other reason ?

> > That would suit our needs beautifully - if, when used to annotate a
> > class, it would allow the various typeinfo / vague-linkage pieces
> > through as 'default'. Is it a realistic suggestion ? / if so, am happy
> > to knock up a patch.
> > 
> > [ and of course, this is only 1/2 the problem - the other half isn't
> > much helped by visibility markup as previously discussed ;-]
>
> Why not? If you know a symbol in DSO won't be overridden by others,
> you can resolve it locally via a linker map.

Sure - the other (more than) 1/2 of the performance problem comes from
named relocations to symbols external to the DSO.

Thanks,

Michael.

-- 
 [EMAIL PROTECTED]  <><, Pseudo Engineer, itinerant idiot



RE: rfa (x86): 387<=>sse moves

2005-08-02 Thread Linthicum, Tony
Hello All,

I applied the recent patches to the 7/23 snapshot, and am still seeing
some 387 to sse moves.  In particular, in SpecFP's 177.mesa (matrix.c),
I'm seeing fld1's feeding moves to sse registers.  

Compiled via: gcc -O3 -march=k8 -mfpmath=sse matrix.c

Thanks.

Tony


-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of
Dale Johannesen
Sent: Monday, August 01, 2005 1:53 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; gcc@gcc.gnu.org
Subject: Re: rfa (x86): 387<=>sse moves


On Jul 31, 2005, at 9:51 AM, Uros Bizjak wrote:

> Hello!
>
>> With -march=pentium4 -mfpmath=sse -O2, we get an extra move for code
>> like
>>
>>    double d = atof(foo);
>>    int i = d;
>>
>>    call    atof
>>    fstpl   -8(%ebp)
>>    movsd   -8(%ebp), %xmm0
>>    cvttsd2si   %xmm0, %eax
>>
>> (This is Linux, Darwin is similar.) I think the difficulty is that for
>
> This problem is similar to the problem, described in PR target/19398. 
> There is another testcase and a small analysis in the PR that might 
> help with this problem.

Thanks, that does seem relevant.  The patches so far don't fix this 
case;
I've commented the PR explaining why.





Re: #pragma support to guide autovectorizer

2005-08-02 Thread Devang Patel


On Aug 2, 2005, at 12:10 AM, Dorit Naishlos wrote:

>> I was wondering if any additional work had been completed toward pragma
>> support for the autovectorization branch (see
>> http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01560.html)?
>
> I think Devang was planning to continue this work - I'm not sure where it
> stands


I made some progress and it is still on my list, but right now I'm
fire-fighting debugging issues. Developers are making lots of noise
on this front as they switch to gcc-4.0.

This work has missed the 4.1 train.

-
Devang



Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Richard Henderson
On Tue, Aug 02, 2005 at 08:32:53AM -0400, Diego Novillo wrote:
> Having said that, I sent rth a 4.0 patch for a similar bug that
> will "fix" this problem.  Richard, have you applied it yet?

No, I forgot about it.


r~


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Diego Novillo
On Tue, Aug 02, 2005 at 10:05:53AM -0700, Richard Henderson wrote:
> On Tue, Aug 02, 2005 at 08:32:53AM -0400, Diego Novillo wrote:
> > Having said that, I sent rth a 4.0 patch for a similar bug that
> > will "fix" this problem.  Richard, have you applied it yet?
> 
> No, I forgot about it.
> 
That's fine.  Just applied it.


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Ian Lance Taylor
Diego Novillo <[EMAIL PROTECTED]> writes:

> On Tue, Aug 02, 2005 at 10:05:37AM -0400, Daniel Jacobowitz wrote:
> 
> > Yes, it does - well, it's implementation defined, but GCC has long
> > chosen the natural interpretation.  C99 6.3.2.3, paragraph 5.  This is
> > no different from that classic example, a pointer which escapes via
> > printf/scanf.
> > 
> OK, thanks.  That settles it then.

Just to close out this thread for the record, Andrew Pinski opened PR
23912 for this problem, and Diego checked in a patch for the 4.0
branch.  So all should be well in 4.0.2.

Ian


memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
In a typical Ethernet/IP ARP header the source IP address is
unaligned. Instead of using...
    out->srcIPAddr = in->dstIPAddr;
... I used...
    memcpy(&out->srcIPAddr, &in->dstIPAddr, sizeof(uint32_t));
... to account for the unaligned destination. This worked until gcc 4,
which now generates a simple load/store.
    ldr r3, [r6, #24]
    adds r2, r4, #0
    adds r2, #14
    str r3, [r2, #0]
A nice optimisation, but in this case it's incorrect. $r4 is aligned,
and the result of adding #14 to $r4 is an unaligned pointer.

Should gcc know better, or do I need to give it a little more
information to help it out?

Please cc me in your reply. Cheers,
Shaun


RE: memcpy to an unaligned address

2005-08-02 Thread Dave Korn
Original Message
>From: Shaun Jackman
>Sent: 02 August 2005 18:33

> In a typical Ethernet/IP ARP header the source IP address is
> unaligned. Instead of using...
>   out->srcIPAddr = in->dstIPAddr;
> ... I used...
>   memcpy(&out->srcIPAddr, &in->dstIPAddr, sizeof(uint32_t));
> ... to account for the unaligned destination. This worked until gcc 4,
> which now generates a simple load/store.
>   ldr r3, [r6, #24]
>   adds r2, r4, #0
>   adds r2, #14
>   str r3, [r2, #0]
> A nice optimisation, but in this case it's incorrect. $r4 is aligned,
> and the result of adding #14 to $r4 is an unaligned pointer.
> 
> Should gcc know better, or do I need to give it a little more
> information to help it out?

  In order for anyone to answer your questions about the alignment of
various types in a struct, don't you think you should perhaps have told us a
little about what those types actually are and how the struct is laid out?
[*]


cheers,
  DaveK

[*] - See debugging, psychic ;)
-- 
Can't think of a witty .sigline today



Re: does the instruction combiner regards (foo & 0xff) as a special case?

2005-08-02 Thread James E Wilson

[EMAIL PROTECTED] wrote:
> I guess the combiner generates something like
> a truncation pattern when special constants are detected.
> The combiner also takes a similar action in pattern


See the section of the documentation that talks about instruction 
canonicalization.

http://gcc.gnu.org/onlinedocs/gccint/Insn-Canonicalizations.html
See in particular the last bullet.

Also, as Joern mentioned, you should try stepping through try_combine to 
see what is really happening.

--
Jim Wilson, GNU Tools Support, http://www.specifix.com


Re: memcpy to an unaligned address

2005-08-02 Thread Falk Hueffner
Shaun Jackman <[EMAIL PROTECTED]> writes:

> In a typical Ethernet/IP ARP header the source IP address is
> unaligned. Instead of using...
>   out->srcIPAddr = in->dstIPAddr;
> ... I used...
>   memcpy(&out->srcIPAddr, &in->dstIPAddr, sizeof(uint32_t));
> ... to account for the unaligned destination. This worked until gcc 4,
> which now generates a simple load/store.
>   ldr r3, [r6, #24]
>   adds r2, r4, #0
>   adds r2, #14
>   str r3, [r2, #0]
> A nice optimisation, but in this case it's incorrect. $r4 is aligned,
> and the result of adding #14 to $r4 is an unaligned pointer.

It isn't incorrect; gcc can assume that pointers are always correctly
aligned for their type. Anything else would result in horrible code.
If your program forms a pointer that is not properly aligned, it is
already invalid, and later breakage is only a symptom of that.
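
One way to stay inside that rule, sketched under assumptions since the thread does not show the poster's struct: treat the packet as a byte buffer and copy at byte offsets, so no uint32_t pointer to an unaligned address is ever formed. The offsets are those of the ARP payload for Ethernet/IPv4 (8 bytes of fixed fields, then 6+4+6+4 bytes of addresses), which matches the #24 and #14 in the generated code; the names are invented.

```c
#include <assert.h>
#include <string.h>

/* Byte offsets within an ARP payload for Ethernet/IPv4. */
enum {
    ARP_SRC_MAC = 8,
    ARP_SRC_IP  = 14,   /* not a multiple of 4 */
    ARP_DST_MAC = 18,
    ARP_DST_IP  = 24,
    ARP_LEN     = 28
};

/* Copies the target IP of `in` into the sender IP of `out`.  Both
   arguments are plain byte pointers, so the compiler has no basis for
   assuming that out + 14 is 4-byte aligned. */
void arp_copy_dst_ip_to_src(unsigned char *out, const unsigned char *in)
{
    assert(out != NULL && in != NULL);
    memcpy(out + ARP_SRC_IP, in + ARP_DST_IP, 4);
}
```

Another option that is often used is to declare the header struct with GCC's packed attribute, so that the member's reduced alignment is part of its type.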

-- 
Falk


RE: splitting load immediates using high and lo_sum

2005-08-02 Thread Tabony, Charles
> From: Dale Johannesen [mailto:[EMAIL PROTECTED] 
> 
> On Jul 21, 2005, at 5:04 PM, Tabony, Charles wrote:
> 
> >> From: Dale Johannesen [mailto:[EMAIL PROTECTED]
> >>
> >> On Jul 21, 2005, at 4:36 PM, Tabony, Charles wrote:
> >>
> >>> Hi,
> >>>
> >>> I am working on a port for a processor that has 32 bit registers but
> >>> can only load 16 bit immediates.
> >>>   ""
> >>>   "%0.h = #HI(%1)")
> >>
> >> What are the semantics of this?  Low bits zeroed, or untouched?
> >> If the former, your semantics are identical to Sparc; look at that.
> >
> > The low bits are untouched.  However, I would expect the compiler to
> > always follow setting the high bits with setting the low bits.
> 
> OK, if you're willing to accept that limitation (your 
> architecture could
> handle putting the LO first, which Sparc can't) then Sparc is still a
> good model to look at.  What it does should work for you.

Earlier I was able to successfully split load immediates into high and
lo_sum insns, and that has worked great as far as scheduling.  However,
I noticed that now instead of loading the address of a constant such as
a string, compiled programs will load the address of a constant that is
the address of that string and then dereference it.  My guess is that
this is caused by the constant in the high/lo_sum pair being hidden from
CSE.

I looked at the way SPARC and MIPS handle the problem, but I don't think
that will work for me.  If I understand correctly, they split the move
into a load immediate that has the lower bits cleared, corresponding to
a sethi or lui instruction, and an ior immediate.  The semantics of the
instructions I am working with, "R0.H = #HI(CONSTANT)" and "R0.L =
#LO(CONSTANT)" are that the half of the register not being set is
unmodified.

Since I can not use an ior immediate like SPARC and MIPS, how can I
split move immediate insns so that they can be efficiently scheduled but
still eliminate the unnecessary indirection?  Also, does the method used
by SPARC and MIPS work for symbols?
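
To make the difference in semantics concrete, here is a small C model of the two schemes. It is an illustration only: the helper names are made up, and the 16/16 split is the one from the port described above (MIPS lui/ori also splits 16/16, while SPARC sethi uses a 22/10 split).

```c
#include <stdint.h>

/* Hypothetical stand-ins for the #HI() and #LO() assembler operators.  */
uint16_t hi16(uint32_t c) { return (uint16_t)(c >> 16); }
uint16_t lo16(uint32_t c) { return (uint16_t)(c & 0xffffu); }

/* Models "R.H = #HI(c)": writes the upper half, lower half untouched.  */
uint32_t set_high_half(uint32_t reg, uint32_t c)
{
    return (reg & 0x0000ffffu) | ((uint32_t)hi16(c) << 16);
}

/* Models "R.L = #LO(c)": writes the lower half, upper half untouched.  */
uint32_t set_low_half(uint32_t reg, uint32_t c)
{
    return (reg & 0xffff0000u) | lo16(c);
}

/* Models the sethi/lui style: the first insn clears the low bits,
   the second ORs the low part in.  */
uint32_t sethi_then_or(uint32_t c)
{
    uint32_t reg = (uint32_t)hi16(c) << 16;
    return reg | lo16(c);
}
```

Because each half-write leaves the other half alone, the two half-writes can be done in either order and still produce the full constant; the sethi-style sequence has to set the high part first.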

Thank you,
Charles


Re: memcpy to an unaligned address

2005-08-02 Thread Mike Stump

On Aug 2, 2005, at 10:32 AM, Shaun Jackman wrote:

In a typical Ethernet/IP ARP header the source IP address is
unaligned. Instead of using...
out->srcIPAddr = in->dstIPAddr;
... I used...
memcpy(&out->srcIPAddr, &in->dstIPAddr, sizeof(uint32_t));
... to account for the unaligned destination. This worked until gcc 4,
which now generates a simple load/store.
ldr r3, [r6, #24]
adds r2, r4, #0
adds r2, #14
str r3, [r2, #0]
A nice optimisation, but in this case it's incorrect. $r4 is aligned,
and the result of adding #14 to $r4 is an unaligned pointer.

Should gcc know better, or do I need to give it a little more
information to help it out?


gcc-help is the correct list to ask this question on.  Anyway, I  
suspect people would be aided in helping you by seeing the source  
code and knowing what version of gcc you're using...  I suspect you  
don't mark the structure as packed and as using 1 or 2 byte  
alignment.  If you do that, then the compiler should generate the  
correct code, for example:


mrs $ cat t1.c
struct {
  char a[14];
  int i __attribute__((aligned(1), packed));
} s, d;

main() {
  d.i = s.i;
}


$ arm-gcc -O4 t1.c -S

gives:

_main:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
str sl, [sp, #-4]!
ldr sl, .L3
ldr r2, .L3+4
.L2:
add sl, pc, sl
ldr ip, [sl, r2]
ldr r0, .L3+8
ldrh r1, [ip, #16]
ldr r2, [sl, r0]
ldrh r3, [ip, #14]
@ lr needed for prologue
strh r1, [r2, #16]   @ movhi
strh r3, [r2, #14]   @ movhi
ldmfd   sp!, {sl}
mov pc, lr

for me.  Notice the adding of 14, notice the two 16 bit moves instead  
of one 4 byte move.  If you lie to the compiler, it will make your  
life rough.  Telling it that it is aligned, when the data isn't  
aligned, is a lie.




Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Dave Korn <[EMAIL PROTECTED]> wrote:
>   In order for anyone to answer your questions about the alignment of
> various types in a struct, don't you think you should perhaps have told us a
> little about what those types actually are and how the struct is laid out?

Of course, my apologies. I was clearly overly terse. I declare the
structure packed as follows:

typedef struct {
uint16_t a;
uint32_t b;
} __attribute__((packed)) st;

void foo(st *s, int n)
{
memcpy(&s->b, &n, sizeof n);
}

This code generates the unaligned store:
$ arm-elf-objdump -d packed.o
...
   0:   e24dd004    sub sp, sp, #4  ; 0x4
   4:   e5801002    str r1, [r0, #2]
   8:   e28dd004    add sp, sp, #4  ; 0x4
   c:   e12fff1e    bx  lr
$ arm-elf-gcc --version | head -1
arm-elf-gcc (GCC) 4.0.1

Cheers,
Shaun


Re: memcpy to an unaligned address

2005-08-02 Thread Paul Koning
One of the things that continues to baffle me (and my colleagues) is
the bizarre way in which attributes such as "packed" work when applied
to structs.

It would be natural to assume, as Shaun did, that marking a struct
"packed" (or, for that matter, "packed,aligned(2)") would apply that
attribute to the fields of the struct.

But it doesn't work that way.  To get the right results, you have to
stick attributes all over the structure fields, one by one.  This is
highly counterintuitive.

Worse yet, in this example the attribute is applied to the structure
elements to some extent but not consistently -- it causes the fields
to be packed -- hence unaligned -- but it does not do unaligned
accesses to the fields.

This sure looks like a bug.

 paul



Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Paul Koning <[EMAIL PROTECTED]> wrote:
> One of the things that continues to baffle me (and my colleagues) is
> the bizarre way in which attributes such as "packed" work when applied
> to structs.
> 
> It would be natural to assume, as Shaun did, that marking a struct
> "packed" (or, for that matter, "packed,aligned(2)") would apply that
> attribute to the fields of the struct.

This is exactly the behaviour suggested by the info docs:

$ info gcc 'C Ext' 'Type Attr'
...
 Specifying this attribute for `struct' and `union' types is
 equivalent to specifying the `packed' attribute on each of the
 structure or union members.

Cheers,
Shaun


Re: More fun with aliasing - removing assignments?

2005-08-02 Thread Daniel Berlin
> > OK, thanks.  That settles it then.
> 
> Just to close out this thread for the record, Andrew Pinski opened PR
> 23912 for this problem, and Diego checked in a patch for the 4.0
> branch.  So all should be well in 4.0.2.
> 

And the alias analyzer for 4.1 has this code, which is why it comes up
with the right answer:

    case NOP_EXPR:
    case CONVERT_EXPR:
    case NON_LVALUE_EXPR:
      {
        tree op = TREE_OPERAND (t, 0);

        /* Cast from non-pointer to pointers are bad news for us.
           Anything else, we see through */
        if (!(POINTER_TYPE_P (TREE_TYPE (t))
              && ! POINTER_TYPE_P (TREE_TYPE (op))))
          return get_constraint_for (op);

        /* FALLTHRU  */

      }
    default:
      {
        temp.type = ADDRESSOF;
        temp.var = anything_id;
        temp.offset = 0;
        return temp;
      }


We special case casts from integer constants like 0 (somewhere else) :)


I decided it wasn't worth trying to change years of practice of "let's
cast integers to pointers" by trying to sneak this in.
I'd rather just watch as all their code explodes for other reasons, like
trying to cast pointers to unsigned int's on a 64 bit machine with LP64
models.
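
For anyone who hasn't hit the LP64 case yet, a small sketch of why it goes wrong (the function names are made up for illustration; uintptr_t is the C99 type from stdint.h):

```c
#include <stdint.h>

/* Nonzero if addr survives a round trip through unsigned int.  On LP64,
   unsigned int is 32 bits and addresses are 64, so high bits are lost.  */
int survives_unsigned_int(uintptr_t addr)
{
    unsigned int narrow = (unsigned int)addr;  /* may drop the high bits */
    return (uintptr_t)narrow == addr;
}

/* The portable spelling: uintptr_t is wide enough to hold a void*.  */
int pointer_survives_uintptr(const void *p)
{
    uintptr_t wide = (uintptr_t)p;
    return (const void *)wide == p;
}
```

The first function only returns nonzero for every address when unsigned int is at least as wide as a pointer, which is not the case under LP64.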




RE: memcpy to an unaligned address

2005-08-02 Thread Dave Korn
Original Message
>From: Shaun Jackman
>Sent: 02 August 2005 20:26

> On 8/2/05, Paul Koning <[EMAIL PROTECTED]> wrote:
>> One of the things that continues to baffle me (and my colleagues) is
>> the bizarre way in which attributes such as "packed" work when applied
>> to structs. 
>> 
>> It would be natural to assume, as Shaun did, that marking a struct
>> "packed" (or, for that matter, "packed,aligned(2)") would apply that
>> attribute to the fields of the struct.
> 
> This is exactly the behaviour suggested by the info docs:
> 
> $ info gcc 'C Ext' 'Type Attr'
> ...
>  Specifying this attribute for `struct' and `union' types is
>  equivalent to specifying the `packed' attribute on each of the
>  structure or union members.
> 


  There are two separate issues here:

1)  Is the base of the struct aligned to the natural alignment, or can the
struct be based at any address

2)  Is there padding between the struct members to maintain their natural
alignments (on the assumption that the struct's base address is aligned.)

  I think this is where some of the ambiguity in the docs comes from.  But
I'm about to leave the office now, so I can't go into depth with this thread
right now


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



RE: memcpy to an unaligned address

2005-08-02 Thread Paul Koning
> "Dave" == Dave Korn <[EMAIL PROTECTED]> writes:

 Dave> Original Message
 >> From: Shaun Jackman Sent: 02 August 2005 20:26

 >> On 8/2/05, Paul Koning <[EMAIL PROTECTED]> wrote:
 >>> One of the things that continues to baffle me (and my colleagues)
 >>> is the bizarre way in which attributes such as "packed" work when
 >>> applied to structs.
 >>> 
 >>> It would be natural to assume, as Shaun did, that marking a
 >>> struct "packed" (or, for that matter, "packed,aligned(2)") would
 >>> apply that attribute to the fields of the struct.
 >>  This is exactly the behaviour suggested by the info docs:
 >> 
 >> $ info gcc 'C Ext' 'Type Attr' ...  Specifying this attribute for
 >> `struct' and `union' types is equivalent to specifying the
 >> `packed' attribute on each of the structure or union members.
 >> 


 Dave> There are two separate issues here:

 Dave> 1) Is the base of the struct aligned to the natural alignment,
 Dave> or can the struct be based at any address

 Dave> 2) Is there padding between the struct members to maintain
 Dave> their natural alignments (on the assumption that the struct's
 Dave> base address is aligned.)

Sure.  But in Shaun's case it looks like (2) has been applied, except
that the compiler doesn't adjust the generated code correctly.  I
would argue that "packed" applied to a whole struct should produce
BOTH effects 1 and 2.

There's a third case for which there appears to be no notation:

3) A pointer to a T that doesn't have the normal alignment of 
   the type T.

For example, as far as I can tell, GCC offers no way to say "pointer
to unaligned int" -- short of creating a one-member struct.
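
A minimal sketch of that one-member struct trick, for reference. The type and function names are made up, and adding may_alias is an extra precaution assumed here, not something discussed above:

```c
#include <stdint.h>

/* Hypothetical wrapper type: a packed struct holding a single uint32_t.
   A pointer to it acts as a "pointer to unaligned uint32_t".  */
typedef struct {
    uint32_t value;
} __attribute__((packed, may_alias)) unaligned_u32;

/* Load a uint32_t from an address with no alignment guarantee.  */
uint32_t load_unaligned_u32(const void *p)
{
    const unaligned_u32 *u = p;
    return u->value;
}

/* Store a uint32_t to an address with no alignment guarantee.  */
void store_unaligned_u32(void *p, uint32_t v)
{
    unaligned_u32 *u = p;
    u->value = v;
}
```

The point of the wrapper is that the member access goes through a packed type, so the compiler is told about the missing alignment at the point of the load or store instead of having to infer it from a plain pointer.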

   paul



GCC-3.4.5 status report

2005-08-02 Thread Gabriel Dos Reis

Hi,

  The number of open PRs registered as GCC-3.4.x regressions only and
targeted for 3.4.5 has decreased from 125 (last week) to 115.  Which
is progress!  Still, we have too many PRs for a stable branch.

Here is the complete list as communicated to me by the bugzilla mail
interface.  Note to Dan: we still lack a way to query for PRs closed
within a specific period of time.  The C++ front-end remains the winner in the
category of maximum number of regressions (46).

bootstrap: 2
  18532 libgcc.mk isn't parallel build safe for multilib
  22213 quoting of dir-variable in mklibgcc.in

c: 5
  16676 ICE with nested functions and -g1, blocks glibc
  20239 ICE on empty preprocessed input
  21536 C99 array of variable length use causes segmentation fault
  22061 internal compiler error: in find_function_data, at function.c:317
  22458 ICE on missing brace

c++: 46
  11224 warning "value computed is not used" no longer emitted
  14500 most specialized function template vs. non-template function
  14950 always_inline does not mix with templates and -O0
  16021 Tests for container swap specialisations FAIL in debug mode
  16030 stdcall function decoration vs LTHUNK alias in multiple inheritanc
  16042 ICE with array assignment
  16276 G++ generates local references to linkonce sections
  16405 Temporary aggregate copy not elided
  16572 Wrong filename/line number were reported by g++ in inlining's warning 
messages
  17248 __always_inline__ throws "unimplemented" in -O0 mode
  17332 Missed inline opportunity
  17609 spurious error message after using keyword
  17655 ICE with using a C99 initializer in an if-condition
  17972 const/pure functions result in bad asm
  18273 Fail to generate debug info for member function.
  18368 C++ error message regression
  18445 ice during overload resolution in template instantiation
  18462 Segfault on declaration of large array member
  18466 int ::i; accepted
  18512 ICE on invalid usage of template base class
  18514 Alternate "asm" name ignored for redeclared builtin function imported 
into namespace std
  18545 ICE when returning undefined type
  18625 triple error message for invalid typedef
  18738 typename not allowed with non-dependent qualified name
  19043 -fpermissive gives bad loop initializations
  19063 ICE on invalid template parameter
  19395 invalid scope qualifier allowed in typedef
  19396 Invalid template in typedef accepted
  19397 ICE with invalid typedef
  19441 Bad error message with invalid destructor declaration
  19628 g++ no longer accepts __builtin_constant_p in constant-expressions
  19710 ice on invalid one line C++ code
  19734 Another ICE on invalid destructor call
  19762 ICE in invalid explicit instantiation of a destructor
  19764 ICE on explicit instantiation of a non-template destructor
  19982 The left side of the "=" operator must be an lvalue.
  20152 ICE compiling krusader-1.5.1 with latest CVS gcc
  20153 ICE when C++ template function contains anonymous union
  20383 #line directive breaks try-catch statement
  20427 ()' not default initialized
  20552 ICE in write_type, at cp/mangle.c:1579
  20905 confuses unrelated type name with instance name
  21784 Using vs builtin names
  22215 g++ -O2 generates Undefined Global for statically defined function
  22545 ICE with pointer to class member & user defined conversion operator
  23162 internal compiler error: in c_expand_expr, at c-common.c:4138

debug: 4
  16035 internal compiler error: in gen_subprogram_die, at dwarf2out.c:10798
  17076 ICE on variable size array initialization in debug mode in C++
  20253 Macro debug info broken due to lexer change
  21932 -O3 -fno-unit-at-a-time causes ICE

fortran: 2
  18913 seg. fault with -finit-local-zero option on complex array of dimension 1
  20774 Debug information in .o (from FORTRAN) points to temporary file under 
certain circumstances

libf2c: 1
  17725 g77 libs installed in wrong directory

libobjc: 1
  11572 GNU libobjc no longer compiled on Darwin

libstdc++: 1
  11953 _REENTRANT defined when compiling non-threaded code.

middle-end: 6
  18956 'bus error' at runtime while passing a special struct to a C++ member 
function
  19183 ICE with -fPIC
  19371 Missing uninitialized warning with dead code (pure/const functions)
  20329 current 3.4.4 miscompiles Linux kernel with athlon optimisations
  21964 broken tail call at -O2 or more
  22177 error: in assign_stack_temp_for_type, at function.c:655

other: 4
  15378 -Werror should provide notification of why gcc is exiting
  17594 GCC does not error about unknown options which starts with a valid 
option
  20731 contrib/gcc_update hard code -r gcc-3_4-branch
  22511 cc1plus: error: unrecognized command line option "-Wno-pointer-sign"

preprocessor: 2
  15307 Preprocessor ICE on invalid input
  19475 missing whitespace after macro name in C90 or C++

rtl-optimization: 20
  11707 constants not propagated in unrolled loop iterations with a conditional
  12863 basic block reordering fails for fallth

Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Dave Korn <[EMAIL PROTECTED]> wrote:
>   There are two separate issues here:
> 
> 1)  Is the base of the struct aligned to the natural alignment, or can the
> struct be based at any address

The base of the struct is aligned to the natural alignment, four bytes
in this case.
 
> 2)  Is there padding between the struct members to maintain their natural
> alignments (on the assumption that the struct's base address is aligned.)

There is no padding. The structure is defined as
__attribute__((packed)) to explicitly remove the padding. The result
is that gcc knows the unaligned four byte member is at an offset of
two bytes from the base of the struct, but uses a four byte load at
the unaligned address of base+2. I don't expect...
p->unaligned = n;
... to work, but I definitely expect
memcpy(&p->unaligned, &n, sizeof p->unaligned);
to work. The second case is being optimised to the first case though
and generating an unaligned store.
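
The layout described above can be checked directly. This is a sketch only: the type and function names are made up for illustration, and __alignof__ is the GCC extension.

```c
#include <stddef.h>
#include <stdint.h>

/* Same members as the struct in this thread, with and without packed.  */
typedef struct {
    uint16_t a;
    uint32_t b;
} __attribute__((packed)) st_packed;

typedef struct {
    uint16_t a;
    uint32_t b;
} st_plain;

/* Offset of b: shows whether padding was inserted after a.  */
size_t packed_offset_of_b(void) { return offsetof(st_packed, b); }
size_t plain_offset_of_b(void)  { return offsetof(st_plain, b); }

/* Alignment the compiler assumes for the packed struct as a whole.  */
size_t packed_struct_alignment(void) { return __alignof__(st_packed); }
```

With GCC on a typical desktop target I would expect offset 2 and alignment 1 for the packed type, but that is worth checking on the target in question rather than assuming.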

Cheers,
Shaun


Re: memcpy to an unaligned address

2005-08-02 Thread Paul Koning
> "Shaun" == Shaun Jackman <[EMAIL PROTECTED]> writes:

 >> 2) Is there padding between the struct members to maintain their
 >> natural alignments (on the assumption that the struct's base
 >> address is aligned.)

 Shaun> There is no padding. The structure is defined as
 Shaun> __attribute__((packed)) to explicitly remove the padding. The
 Shaun> result is that gcc knows the unaligned four byte member is at
 Shaun> an offset of two bytes from the base of the struct, but uses a
 Shaun> four byte load at the unaligned address of base+2. I don't
 Shaun> expect...
 Shaun>  p->unaligned = n;
 Shaun> ... to work, ...

I would.  If you tell gcc that a thing is unaligned, it is responsible
for doing unaligned references to it.  That very definitely includes
direct references to the content in expressions.  And in general that
works.  Clearly there is a GCC bug here; GCC put the field at an
unaligned offset, but did not do unaligned references to it.

  paul



Re: memcpy to an unaligned address

2005-08-02 Thread Mike Stump

On Aug 2, 2005, at 1:15 PM, Shaun Jackman wrote:

There is no padding. The structure is defined as
__attribute__((packed)) to explicitly remove the padding. The result
is that gcc knows the unaligned four byte member is at an offset of
two bytes from the base of the struct, but uses a four byte load at
the unaligned address of base+2. I don't expect...
p->unaligned = n;
... to work,


Actually, that works just fine, with:

typedef struct {
  unsigned short int a;
  unsigned int b;
} __attribute__((packed)) st;

void foo(st *s, int n)
{
  s->b = n;
}

I get:

_foo:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r3, r1, lsr #24
mov r2, r1, lsr #8
mov ip, r1, lsr #16
@ lr needed for prologue
strb r3, [r0, #5]
strb r2, [r0, #3]
strb ip, [r0, #4]
strb r1, [r0, #2]
mov pc, lr


but I definitely expect
memcpy(&p->unaligned, &n, sizeof p->unaligned);
to work.


Ah, I was having trouble getting it to fail for me...  Now I can:

#include <string.h>

typedef struct {
  unsigned short int a;
  unsigned int b;
} __attribute__((packed)) st;

void foo(st *s, int n)
{
  memcpy(&s->b, &n, sizeof n);
}

_foo:
@ args = 0, pretend = 0, frame = 4
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
sub sp, sp, #4
@ lr needed for prologue
str r1, [r0, #2]
add sp, sp, #4
bx  lr

Yes, this is a compiler bug in the expansion of memcpy, please file a  
bug report.  The solution is for the compiler to notice the memory  
alignment of the destination and `do-the-right-thing' when it isn't  
aligned.


Re: memcpy to an unaligned address

2005-08-02 Thread Andrew Pinski
> 
> On Aug 2, 2005, at 1:15 PM, Shaun Jackman wrote:
> > There is no padding. The structure is defined as
> > __attribute__((packed)) to explicitly remove the padding. The result
> > is that gcc knows the unaligned four byte member is at an offset of
> > two bytes from the base of the struct, but uses a four byte load at
> > the unaligned address of base+2. I don't expect...
> > p->unaligned = n;
> > ... to work,
> 
> Actually, that works just fine, with:
> 
> typedef struct {
>unsigned short int a;
>unsigned int b;
> } __attribute__((packed)) st;
> 
> void foo(st *s, int n)
> {
>s->b = n;
> }
> 
> Ah, I was having trouble getting it to fail for me...  Now I can:
> 
> #include <string.h>
> 
> typedef struct {
>unsigned short int a;
>unsigned int b;
> } __attribute__((packed)) st;
> 
> void foo(st *s, int n)
> {
>memcpy(&s->b, &n, sizeof n);
> }
> 
> Yes, this is a compiler bug in the expansion of memcpy, please file a  
> bug report.  The solution is for the compiler to notice the memory  
> alignment of the destination and `do-the-right-thing' when it isn't  
> aligned.

No it is not, once you take the address (which should be rejected), it
is of type "unsigned int *" and not unaligned variable, passing it to
memcpy assumes the type alignment is the natural alignment.

-- Pinski



Re: memcpy to an unaligned address

2005-08-02 Thread Ian Lance Taylor
Andrew Pinski <[EMAIL PROTECTED]> writes:

> > Yes, this is a compiler bug in the expansion of memcpy, please file a  
> > bug report.  The solution is for the compiler to notice the memory  
> > alignment of the destination and `do-the-right-thing' when it isn't  
> > aligned.
> 
> No it is not, once you take the address (which should be rejected), it
> is of type "unsigned int *" and not unaligned variable, passing it to
> memcpy assumes the type alignment is the natural alignment.

That argument doesn't make sense to me.  memcpy takes a void*
argument, which has no presumed alignment.  The builtin should work
the same way.  That is, there is an implicit cast to void* in the
argument to memcpy.  The compiler can certainly take advantage of any
knowledge it has about the alignment, but it can't assume anything
about the alignment that it doesn't already know.

Ian


Re: memcpy to an unaligned address

2005-08-02 Thread Paul Koning
> "Andrew" == Andrew Pinski <[EMAIL PROTECTED]> writes:

 >> Yes, this is a compiler bug in the expansion of memcpy, please
 >> file a bug report.  The solution is for the compiler to notice the
 >> memory alignment of the destination and `do-the-right-thing' when
 >> it isn't aligned.

 Andrew> No it is not, once you take the address (which should be
 Andrew> rejected), it is of type "unsigned int *" and not unaligned
 Andrew> variable, passing it to memcpy assumes the type alignment is
 Andrew> the natural alignment.

That seems like a misfeature.

It sounds like the workaround is to avoid memcpy, and just use
variable assignment.  Alternatively, cast the pointers to char*, which
should force memcpy to do the right thing.  Ugh.

   paul



GCC 4.2 Projects

2005-08-02 Thread Mark Mitchell
Although we're still in Stage 3, it's time to start thinking about GCC 
4.2.  I know that many people are working on projects that they hope to 
include in GCC 4.2, and it's reasonable to start gathering them.  I 
don't plan to actually work on ordering them in any coherent way for a 
few more weeks, and I want to keep the focus on fixing bugs in Stage 3, 
but I also don't want to be obstructing forward progress for GCC 4.2.


In keeping with the clear preference from 4.1, we'll do all project 
proposals as publicly posted Wiki pages on the GCC Wiki.  I'll poll the 
Wiki for new projects, but I think people might appreciate mail to the 
GCC mailing list when you add something.


See:

  http://gcc.gnu.org/wiki/GCC%204.2%20Projects

for some guidelines.

Thanks,

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: memcpy to an unaligned address

2005-08-02 Thread Mike Stump

On Aug 2, 2005, at 1:37 PM, Andrew Pinski wrote:

No it is not,


:-)  Ah, yes, the old, we don't have pointers to unaligned types  
problem...  anyway, we can at least agree that this is a gaping hole  
people can drive trucks through in the type system, but I'm still  
claiming it isn't a feature on theoretic grounds.


:-(

Shaun, want to do up an entry in the manual describing this?  We have  
known about this for years and years, but, we don't do a good job  
communicating it to users.  Essentially, & doesn't work as one would  
expect on unaligned data, as it produces a pointer to an aligned  
object instead of a pointer to unaligned object.  Essentially, we  
don't have a type system that contains pointer to unaligned types.   
The compiler then goes on to make codegen choices based upon the fact  
that the data are known to be aligned, and bad things happen.




Re: memcpy to an unaligned address

2005-08-02 Thread Joe Buck
On Tue, Aug 02, 2005 at 02:04:16PM -0700, Mike Stump wrote:
> Shaun, want to do up an entry in the manual describing this?  We have  
> known about this for years and years, but, we don't do a good job  
> communicating it to users.  Essentially, & doesn't work as one would  
> expect on unaligned data, as it produces a pointer to an aligned  
> object instead of a pointer to unaligned object.

I suppose we could make & on an unaligned object return a void*.  That
isn't really right, but it would at least prevent the cases that we know
don't work from compiling.



Re: memcpy to an unaligned address

2005-08-02 Thread Mike Stump

On Aug 2, 2005, at 1:45 PM, Ian Lance Taylor wrote:

That argument doesn't make sense to me.  memcpy takes a void*
argument, which has no presumed alignment.


The memcpy builtin uses the static type of the actual argument  
(before conversion to void*), to gain hints about the alignments of  
the data coming in.  This is so that we can produce nice fast code  
for 1-16 byte objects.  This is actually good.  The real problem is  
formation of the address of the member doesn't produce a pointer to  
unaligned type, but rather a pointer to aligned type, this is the  
part that is wrong.  We'd have to add pointers to unaligned data to  
our type system to fix it.  That should be done, but is a hard/big  
job, and no one has stepped forward to do it.




Re: memcpy to an unaligned address

2005-08-02 Thread Joe Buck
On Tue, Aug 02, 2005 at 02:29:44PM -0700, Mike Stump wrote:
> On Aug 2, 2005, at 1:45 PM, Ian Lance Taylor wrote:
> >That argument doesn't make sense to me.  memcpy takes a void*
> >argument, which has no presumed alignment.
> 
> The memcpy builtin uses the static type of the actual argument  
> (before conversion to void*), to gain hints about the alignments of  
> the data coming in.  This is so that we can produce nice fast code  
> for 1-16 byte objects.  This is actually good.  The real problem is  
> formation of the address of the member doesn't produce a pointer to  
> unaligned type, but rather a pointer to aligned type, this is the  
> part that is wrong.  We'd have to add pointers to unaligned data to  
> our type system to fix it.  That should be done, but is a hard/big  
> job, and no one has stepped forward to do it.

So my suggestion to just make pointers to unaligned objects void* would
work in this case, then.



Re: memcpy to an unaligned address

2005-08-02 Thread Joe Buck
On Tue, Aug 02, 2005 at 04:07:00PM -0600, Shaun Jackman wrote:
> On 8/2/05, Joe Buck <[EMAIL PROTECTED]> wrote:
> > I suppose we could make & on an unaligned object return a void*.  That
> > isn't really right, but it would at least prevent the cases that we know
> > don't work from compiling.
> 
> That sounds like a dangerous idea only because I'd expect...
>   int *p = &packed_struct.unaligned_member;
> ... to fail if unaligned_member is not an int, but if the & operator
> returns a void*, it would suddenly become very permissive.

Ah.  I was thinking as a C++ programmer, where void* cannot be assigned to
int* without an explicit cast.  The decision to allow this in C was the
worst mistake the standards committee made.

The problem is that the type returned by malloc is not just any void*,
but a special pointer that is guaranteed to have alignment sufficient
to store any type.  This is very different from the type of the arguments
to memcpy, which is assumed to have no alignment that can be counted on.



Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Joe Buck <[EMAIL PROTECTED]> wrote:
> I suppose we could make & on an unaligned object return a void*.  That
> isn't really right, but it would at least prevent the cases that we know
> don't work from compiling.

That sounds like a dangerous idea only because I'd expect...
int *p = &packed_struct.unaligned_member;
... to fail if unaligned_member is not an int, but if the & operator
returns a void*, it would suddenly become very permissive.

Cheers,
Shaun


Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Paul Koning <[EMAIL PROTECTED]> wrote:
> It sounds like the workaround is to avoid memcpy, and just use
> variable assignment.  Alternatively, cast the pointers to char*, which
> should force memcpy to do the right thing.  Ugh.

I swear originally, back in the gcc 2.95 days, I used memcpy because
the memcpy function checked for unaligned pointers, whereas storing to
and loading from unaligned variables generated a simple store/load
instruction which wouldn't work. It seems the tables have turned and
the exact opposite is true now with gcc 4, where memcpy doesn't work,
but unaligned variables do. I believe gcc 3 behaved the same as gcc 2
-- memcpy worked, unaligned variables didn't work. Can someone confirm
this summary is correct?

It seems to me there's an argument for a _memcpy_unaligned(3)
function, as ugly as that is.
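
Something along these lines is what I have in mind. It is only a sketch: copy_unaligned is a made-up name, and whether a plain byte loop is enough to keep a given gcc version from widening the access is an assumption. The volatile qualifier is there to discourage the compiler from merging the byte stores.

```c
#include <stddef.h>

/* Hypothetical byte-at-a-time copy that makes no alignment assumption
   about either pointer.  Returns dst, like memcpy.  */
void *copy_unaligned(void *dst, const void *src, size_t n)
{
    volatile unsigned char *d = dst;   /* volatile: keep byte stores */
    const unsigned char *s = src;

    while (n--)
        *d++ = *s++;
    return dst;
}
```

It would be called the same way as memcpy, e.g. copy_unaligned(&s->b, &n, sizeof n), at the cost of giving up the fast path for aligned data.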

Cheers,
Shaun


Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Paul Koning <[EMAIL PROTECTED]> wrote:
> It sounds like the workaround is to avoid memcpy, and just use
> variable assignment.  Alternatively, cast the pointers to char*, which
> should force memcpy to do the right thing.  Ugh.

Casting to void* does not work either. gcc keeps the alignment
information -- but not the *unalignment* information, if that
distinction makes any sense -- of a particular variable around as long
as it can, through casts and even through assignment. The unalignment
information, on the other hand, is lost immediately after the &
operator. None of these examples produce an unaligned load:

memcpy(&s->b, &n, sizeof n);

memcpy((void*)&s->b, &n, sizeof n);

void *p = &s->b;
memcpy(p, &n, sizeof n);

But as pointed out by others, this does produce an unaligned load:

s->b = n;

Cheers,
Shaun


Re: memcpy to an unaligned address

2005-08-02 Thread Shaun Jackman
On 8/2/05, Shaun Jackman <[EMAIL PROTECTED]> wrote:
> operator. None of these examples produce an unaligned load:

I should clarify the wording I'm using here. By "an unaligned load" I
mean code to safely load from an unaligned pointer.

Cheers,
Shaun


Re: GCC-3.4.5 status report

2005-08-02 Thread Daniel Berlin
On Tue, 2005-08-02 at 22:07 +0200, Gabriel Dos Reis wrote:
> Hi,
> 
>   The number of open PRs registered as GCC-3.4.x regressions only and
> targeted for 3.4.5 has decreased from 125 (last week) to 115.  Which
> is progress!  Still, we have too many PRs for a stable branch.
> 
> Here is the complete list as communicated to me by the bugzilla mail
> interface.  Note to Dan: we still lack a way to query for PRs closed
> within a specific period of time.  


I'll work on this, but i probably won't get to it till next week (best
guess).




gcc-3.4-20050802 is now available

2005-08-02 Thread gccadmin
Snapshot gcc-3.4-20050802 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/3.4-20050802/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 3.4 CVS branch
with the following options: -rgcc-ss-3_4-20050802 

You'll find:

gcc-3.4-20050802.tar.bz2  Complete GCC (includes all of below)

gcc-core-3.4-20050802.tar.bz2 C front end and core compiler

gcc-ada-3.4-20050802.tar.bz2  Ada front end and runtime

gcc-g++-3.4-20050802.tar.bz2  C++ front end and runtime

gcc-g77-3.4-20050802.tar.bz2  Fortran 77 front end and runtime

gcc-java-3.4-20050802.tar.bz2 Java front end and runtime

gcc-objc-3.4-20050802.tar.bz2 Objective-C front end and runtime

gcc-testsuite-3.4-20050802.tar.bz2    The GCC testsuite

Diffs from 3.4-20050726 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-3.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Attempt for rotating register allocation

2005-08-02 Thread Chunjiang Li
Hi, all 

So far, using the dataflow information (generated with df.c and df.h), I can
find the pseudo registers used and defined in a single-basic-block loop.

Now I need to set up a struct to record the lifetime of each pseudo register
in a software-pipelined loop. Based on this struct, we can allocate rotating
registers to each pseudo register used and defined in SWP loops.
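
A rough sketch of such a lifetime record, assuming cycles are counted within the schedule of a single iteration and that ii is the loop's initiation interval. All names here (swp_lifetime, swp_rotating_regs_needed) are hypothetical and are not existing GCC structures or functions. The estimate ceil(lifetime / ii) is a commonly used approximation, not a statement of how GCC would do it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one pseudo register's lifetime inside a
   software-pipelined loop.  Cycles are relative to the start of a
   single iteration's schedule.  */
struct swp_lifetime {
    int regno;          /* pseudo register number */
    int def_cycle;      /* cycle in which the value is defined */
    int last_use_cycle; /* cycle of the last use, >= def_cycle */
};

/* Estimate how many rotating registers the value needs when a new
   iteration starts every ii cycles: ceil(lifetime / ii), with a
   minimum of one register.  */
static int swp_rotating_regs_needed(const struct swp_lifetime *lt, int ii)
{
    int length;

    assert(lt != NULL && ii > 0);
    assert(lt->last_use_cycle >= lt->def_cycle);

    length = lt->last_use_cycle - lt->def_cycle;
    if (length == 0)
        length = 1;                 /* a defined value needs a register */
    return (length + ii - 1) / ii;  /* integer ceiling division */
}
```

Under these assumptions, a value defined at cycle 0 and last used at cycle 5 with ii = 2 would need three rotating registers.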

Rotating register allocation for SWP may be interfered with by the register
allocation pass in current GCC, which may cause it to fail. I still want to
try it; along the way I can learn more about the GCC back end.

Advice and cooperation are welcome.

Chunjiang Li

Creative Compiler Research Group,
National University of Defense Technology, China.


Partial Success Building 4.0.1 on x86_64-slackware-linux

2005-08-02 Thread Kurt Wall
I have a partial success on x86_64-slackware-linux. It is partial
because it (mostly) bootstraps (see item 1) but fails to install
(see item 2).

1. Java compilation repeatedly failed, so I dropped it from the
   languages to build
2. While bootstrap succeeds, "make install" fails with the following
   error:

$ sudo make install
/bin/sh ../gcc-4.0.1/mkinstalldirs /usr/local/gcc401 /usr/local/gcc401
make[1]: Entering directory `/home/kurt/books/gccbook2/gcc-obj/fixincludes'
make[1]: *** No rule to make target `../libiberty/libiberty.a', needed by `full-stamp'.  Stop.
make[1]: Leaving directory `/home/kurt/books/gccbook2/gcc-obj/fixincludes'
make: *** [install-fixincludes] Error 2

Output from config.guess: 
x86_64-unknown-linux-gnu

Output from resulting gcc -v:

Languages: 
c,c++,objc

Distribution:
slamd64 (Slackware 64-bit Linux distribution)

Kernel:
Linux easter 2.6.12.3 #1 Fri Jul 29 06:04:06 EDT 2005 x86_64 AMD
Athlon(tm) 64 Processor 3200+ AuthenticAMD GNU/Linux

C Library:
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.3.3.
Compiled on a Linux 2.4.26 system on 2004-05-24.
Available extensions:
GNU libio by Per Bothner
crypt add-on version 2.1 by Michael Glad and others
linuxthreads-0.10 by Xavier Leroy
BIND-8.2.3-T5B
libthread_db work sponsored by Alpha Processor Inc
NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk

./configure invocation:
 ../gcc-4.0.1/configure \
 --disable-nls \
 --with-gnu-gettext \
 --prefix=/usr/local \
 --host=x86_64-slackware-linux \
 --target=x86_64-slackware-linux \
 --enable-languages=c,c++,objc

Kurt
-- 
Know what I hate most?  Rhetorical questions.
-- Henry N. Camp
-- 
"Speed is subsittute fo accurancy."


Request to reopen a PR

2005-08-02 Thread Greg Schafer
Hi

Sorry if this is the wrong address to contact.

This is a minor request for a minor libmudflap problem.

Could somebody with appropriate privilege please do me a favor and reopen
the following bugzilla PR?

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20003

It seems the system won't let me do it because I'm not the original
reporter.

Thanks
Greg