Re: [trans-mem] cgraph edges vs function cloning

2009-07-29 Thread Martin Jambor
Hi,

On Tue, Jul 28, 2009 at 04:26:12PM -0700, Richard Henderson wrote:
> On 07/28/2009 10:44 AM, Richard Henderson wrote:
> >I guess I'll poke at cleaning this up today. I've got to
> >familiarize myself with how virtual clones work...
> 
> The virtual clones that ipa-cp makes seem to be easy.
> 
> My thought here is that since (virtual) clones don't
> have actual bodies (and when they acquire bodies they
> cease to be clones), then there's no reason for them
> to have callee edges at all. 

That is not really true.  Consider the following example:

static void a (int i)
{
  DO_SOMETHING_WITH_I (i);
}

void b (int i)  /* Not static! */
{
  a (i);
}

void c (void)
{
  b (2);
}

After ipa-cp (even today), we might end up with a call graph where c
would call b.clone.0 which in turn would call a.clone.1, while the
original b would still call the original a.

Moreover, I have an ipa-cp improvement patch that removes callee edges
of virtual clones if it can prove that they are never called after a
constant is substituted into a parameter, like in the example below:

int f(int i)
{
  if (i == 0)
    {
      call_some_nasty_function(...);
    }
  ...
}

int g() {return f(1);}

Martin


Re: GCC 4.3.4 release candidate available

2009-07-29 Thread Dave Korn
Richard Guenther wrote:
> A release candidate for the GCC 4.3.4 is now available at
> 
> ftp://gcc.gnu.org/pub/gcc/snapshots/4.3.4-RC-20090727
> 
> I plan to roll out the final release at the beginning of next week
> if there are no major problems reported.

  Bootstrap failure on Cygwin:

> /gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/host-i686-pc-cygwin/gcc/xgcc
>   -B/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/host-i686-pc-cygwin/gcc/
>   -B/opt/gcc-tools/i686-pc-cygwin/bin/ -B/opt/gcc-tools/i686-pc-cygwin/lib/
>   -isystem /opt/gcc-tools/i686-pc-cygwin/include
>   -isystem /opt/gcc-tools/i686-pc-cygwin/sys-include
>   -c -DHAVE_CONFIG_H -O2 -g -g -O2 -I. -I../.././libiberty/../include -W -Wall
>   -Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic
>   ../.././libiberty/strsignal.c -o strsignal.o
> ../.././libiberty/strsignal.c:408: error: conflicting types for 'strsignal'
> /usr/include/string.h:79: error: previous declaration of 'strsignal' was here
> make[2]: *** [strsignal.o] Error 1
> make[2]: Leaving directory `/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/i686-pc-cygwin/libiberty'
> make[1]: *** [all-target-libiberty] Error 2
> make[1]: Leaving directory `/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727'
> make: *** [all] Error 2

  This is PR 38903.  Can we backport the fix?  (I'm just testing it against the
RC tarball right now.)

cheers,
  DaveK




Re: GCC 4.3.4 release candidate available

2009-07-29 Thread Richard Guenther
On Wed, 29 Jul 2009, Dave Korn wrote:

> Richard Guenther wrote:
> > A release candidate for the GCC 4.3.4 is now available at
> > 
> > ftp://gcc.gnu.org/pub/gcc/snapshots/4.3.4-RC-20090727
> > 
> > I plan to roll out the final release at the beginning of next week
> > if there are no major problems reported.
> 
>   Bootstrap failure on Cygwin:
> 
>   This is PR 38903.  Can we backport the fix?  (I'm just testing it against 
> the
> RC tarball right now.)

Yes, if this is ok with the Cygwin maintainers.

Thanks,
Richard.


Re: GCC 4.3.4 release candidate available

2009-07-29 Thread Kai Tietz
2009/7/29 Richard Guenther:
> On Wed, 29 Jul 2009, Dave Korn wrote:
>
>> Richard Guenther wrote:
>> > A release candidate for the GCC 4.3.4 is now available at
>> >
>> > ftp://gcc.gnu.org/pub/gcc/snapshots/4.3.4-RC-20090727
>> >
>> > I plan to roll out the final release at the beginning of next week
>> > if there are no major problems reported.
>>
>>   Bootstrap failure on Cygwin:
>>
>>   This is PR 38903.  Can we backport the fix?  (I'm just testing it against 
>> the
>> RC tarball right now.)
>
> Yes, if this is ok with the Cygwin maintainers.
>
> Thanks,
> Richard.
>

It is fine.

Thanks,
Kai

-- 
|  (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination


Re: [lambda] Segmentation fault in simple lambda program

2009-07-29 Thread Adam Butcher
Hi

Esben Mose Hansen writes:
> this program SEGFAULTs
>
> #include <algorithm>
>
> int main() {
>   int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
>   const std::size_t nn = sizeof(numbers)/sizeof(int);
>   int sum = 0;
>   int f = 5;
>   std::for_each(&numbers[0], &numbers[nn],  [&]  (int n)  {
>     sum += n * f;
>   });
>
> }
> ...
> I am completely new to gcc hacking, just
> dying to get lambda into gcc 4.5 :)
>
Me too on both counts!  So much so that I've got a working copy of the latest
lambda branch, svnmerged the latest 4.5.0 trunk into it, fixed a few build
issues and started poking around.  I have never ventured into gcc internals
before so it's all a bit alien at the mo.

> On Thursday 30 April 2009 19:19:31 Ian wrote:
> > When I try to specify the capture it works ((&sum, &f) works too but f is
> > const):
> >
> > #include <algorithm>
> >
> > int
> > main(void)
> > {
> >   int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
> >   const std::size_t nn = sizeof(numbers)/sizeof(int);
> >   int sum = 0;
> >   int f = 5;
> >
> >   //std::for_each(&numbers[0], &numbers[nn], [&](int n) { sum += n * f; });
> >
> >   std::for_each(&numbers[0], &numbers[nn], [&sum, f](int n) { sum += n * f;
> > });
> >
> >   return 0;
> > }
>
>  Yup. In fact, almost any other capture block than the [&] works :) I will try
>  to look at those tree options when I get sober again.
>
It is crashing when invoking a copied 'implicit capture' lambda.  The same
occurs with:

  int main()
  {
     char i,j,k;
     auto const& f = [&] { i = 1; j = 2; k = 3; };
     auto g = f;
     g();
     return i+j+k;
  }

With explicit captures (i.e. specifying [&i,&j,&k] instead of [&] above) it
works fine.  Also using f() (in place of g()) is fine so the code in the
lambda and the call to it must be okay.  So I started looking into the
instance data.

The resulting lambda class in the above program is generated by the compiler
to look something like the following in name and structure:

   struct __lambda0
   {
      char &i,&j,&k;
      void operator() () const
      {
         i = 1; j = 2; k = 3;
      }
   };

Looking at the implementation in gcc/cp/{class.c,parser.c,semantics.c} it
seems that, in implicit capture mode, references to enclosing scope
identifiers are retrospectively added to the lambda class on first use in the
lambda body.  This made me worry about the class structure changing as you
progress through the parse of the lambda body.  I.e. at the start of the body
nothing is captured -- since nothing is referenced.  As you meet enclosing
scope references, each is added as a capture member to the lambda class.  Is
this okay or has something already decided on the size and structure of the
class?  I figured (almost certainly naively and incorrectly) that it ought to
be similar to the difference between:

   struct X
   {
      int i;
      void init_i() { i = 1; }
   };

and

   struct X
   {
      void init_i() { i = 1; }
      int i;
   };

I changed the program above to check the size of the generated lambda class
and it is indeed as expected (three pointers).  So the class has the correct
size -- why does it not copy?  Surely a bitwise copy in this case is
sufficient and that ought to be auto-generated.  -- Except we're in compiler
land here -- are we supposed to do the auto-generating?  To test the theory I
added

   memcpy(&g,&f,sizeof g);

after the assignment (auto g = f) to force the instance data to be copied
from f to g.  It worked!  So why is the compiler not generating suitable code
for the lambda class copy -- the size is right, but no copy of instance data
is made -- maybe it's already decided that the size is zero (as it was before
the lambda body) and generated 'do-nothing' copy constructors?

I had a poke around in gcc/cp/{class.c,parser.c,semantics.c} and believe I
have worked around the issue -- I don't think the proper solution is as
straightforward (or maybe it's more straightforward for a gcc guru?).

Note that the following diff hunk is after an svnmerge of trunk commits into
my wc of the lambda branch so offsets will likely be wrong.  I would give a
full diff but my wc is in rather a messy state at the mo.  I have no idea
whether there are any nasty side effects to doing this but it seems to do the
copy correctly afterwards.  I have not looked into it much further at the mo.
Thought I'd just post my findings.

### snip ##
--- gcc/cp/parser.c (revision 150148)
+++ gcc/cp/parser.c (working copy)
@@ -6936,6 +6991,24 @@

 cp_parser_lambda_body (parser, lambda_expr);

+/* relayout again -- to allow for implicit
+ * parameters to have been added to the capture if it was a
+ * 'default capture' -- note that this would not be necessary if
+ * the stack-pointer variant was implemented -- since the layout
+ * would be known.
+ * Relayingout here might have nasty effect if one were to query
+ * sizeof *this from within the body -- would that ev

m68k - GCC 4.4.0 generates not so good code from asm inline

2009-07-29 Thread ami_stuff
Hi,

Here is some C source code that I compiled with GCC 3.4.0 and GCC 4.4.0. The
GCC 3.4.0 output looks a lot better.

#include 
#include 

#define umul_ppmm(xh, xl, a, b) \
__asm__ ("| Inlined umul_ppmm\n" \
" move.l %0,%/d5\n" \
" move.l %1,%/d4\n" \
" moveq #16,%/d3\n" \
" move.l %0,%/d2\n" \
" mulu %1,%0\n" \
" lsr.l %/d3,%/d4\n" \
" lsr.l %/d3,%/d5\n" \
" mulu %/d4,%/d2\n" \
" mulu %/d5,%1\n" \
" mulu %/d5,%/d4\n" \
" move.l %/d2,%/d5\n" \
" lsr.l %/d3,%/d2\n" \
" add.w %1,%/d5\n" \
" addx.l %/d2,%/d4\n" \
" lsl.l %/d3,%/d5\n" \
" lsr.l %/d3,%1\n" \
" add.l %/d5,%0\n" \
" addx.l %/d4,%1" \
: "=d" ((uint32_t) (xl)), "=d" ((uint32_t) (xh)) \
: "0" ((uint32_t) (a)), "1" ((uint32_t) (b)) \
: "d2", "d3", "d4", "d5")

inline int64_t MUL64(int a, int b)
{
    uint32_t au = a;
    uint32_t bu = b;

    uint32_t resh, resl;
    uint64_t res;

    umul_ppmm(resh, resl, au, bu);

    if (a < 0)
        resh -= bu;
    if (b < 0)
        resh -= au;

    res = ((uint64_t)resh << 32) | resl;

    return res;
}


GCC 4.4.0 asm output:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
movem.l #16128,-(sp)
move.l 28(sp),d0
move.l 32(sp),a0
move.l d0,d6
move.l a0,d1
#APP
;# 36 "mul642.c" 1
| Inlined umul_ppmm
move.l d6,d5
move.l d1,d4
moveq #16,d3
move.l d6,d2
mulu d1,d6
lsr.l d3,d4
lsr.l d3,d5
mulu d4,d2
mulu d5,d1
mulu d5,d4
move.l d2,d5
lsr.l d3,d2
add.w d1,d5
addx.l d2,d4
lsl.l d3,d5
lsr.l d3,d1
add.l d5,d6
addx.l d4,d1
#NO_APP
tst.l d0
jlt L6
tst.l a0
jlt L7
L3:
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L7:
sub.l d0,d1
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L6:
sub.l a0,d1
tst.l a0
jge L3
jra L7 

GCC 3.4.0 asm output:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
moveml #0x3f00,sp@-
movel sp@(28),d1
movel sp@(32),d0
movel d1,d7
movel d0,d6
#APP
| Inlined umul_ppmm
move.l d7,d5
move.l d6,d4
moveq #16,d3
move.l d7,d2
mulu d6,d7
lsr.l d3,d4
lsr.l d3,d5
mulu d4,d2
mulu d5,d6
mulu d5,d4
move.l d2,d5
lsr.l d3,d2
add.w d6,d5
addx.l d2,d4
lsl.l d3,d5
lsr.l d3,d6
add.l d5,d7
addx.l d4,d6
#NO_APP
tstl d1
jlt L5
tstl d0
jge L3
jra L6
.even
L5:
subl d0,d6
tstl d0
jge L3
.even
L6:
subl d1,d6
.even
L3:
movel d6,d0
clrl d1
orl d7,d1
moveml sp@+,#0xfc
rts 

Is it a regression?

Regards



Re: [lambda] Segmentation fault in simple lambda program

2009-07-29 Thread Ed Smith-Rowland

Adam Butcher wrote:

Hi

Esben Mose Hansen writes:
  

this program SEGFAULTs

#include <algorithm>

int main() {
  int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  const std::size_t nn = sizeof(numbers)/sizeof(int);
  int sum = 0;
  int f = 5;
  std::for_each(&numbers[0], &numbers[nn],  [&]  (int n)  {
    sum += n * f;
  });

}
...
I am completely new to gcc hacking, just
dying to get lambda into gcc 4.5 :)



Me too on both counts!  So much so that I've got a working copy of the latest
lambda branch, svnmerged the latest 4.5.0 trunk into it, fixed a few build
issues and started poking around.  I have never ventured into gcc internals
before so it's all a bit alien at the mo.

  

On Thursday 30 April 2009 19:19:31 Ian wrote:


When I try to specify the capture it works ((&sum, &f) works too but f is
const):

#include <algorithm>

int
main(void)
{
  int numbers[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
  const std::size_t nn = sizeof(numbers)/sizeof(int);
  int sum = 0;
  int f = 5;

  //std::for_each(&numbers[0], &numbers[nn], [&](int n) { sum += n * f; });

  std::for_each(&numbers[0], &numbers[nn], [&sum, f](int n) { sum += n * f;
});

  return 0;
}
  

 Yup. In fact, almost any other capture block than the [&] works :) I will try
 to look at those tree options when I get sober again.



It is crashing when invoking a copied 'implicit capture' lambda.  The same 
occurs with:

  int main()
  {
 char i,j,k;
 auto const& f = [&] { i = 1; j = 2; k = 3; };
 auto g = f;
 g();
 return i+j+k;
  }

With explicit captures (i.e. specifying [&i,&j,&k] instead of [&] above) it 
works fine.  Also using f() (in place of
g()) is fine so the code in the lambda and the call to it must be okay.  So I 
started looking into the instance data.

The resulting lambda class in the above program is generated by the compiler to 
look something like the following in
name and structure:

   struct __lambda0
   {
  char &i,&j,&k;
  void operator() () const
  {
 i = 1; j = 2; k = 3;
  }
   };

Looking at the implementation in gcc/cp/{class.c,parser.c,semantics.c} it
seems that, in implicit capture mode, references to enclosing scope
identifiers are retrospectively added to the lambda class on first use in the
lambda body.  This made me worry about the class structure changing as you
progress through the parse of the lambda body.  I.e. at the start of the body
nothing is captured -- since nothing is referenced.  As you meet enclosing
scope references, each is added as a capture member to the lambda class.  Is
this okay or has something already decided on the size and structure of the
class?  I figured (almost certainly naively and incorrectly) that it ought to
be similar to the difference between:

   struct X
   {
  int i;
  void init_i() { i = 1; }
   };

and

   struct X
   {
  void init_i() { i = 1; }
  int i;
   };

I changed the program above to check the sizes of the generated lambda class 
and it is indeed as expected (three
pointers). So the class has the correct size -- why does it not copy?  Surely a 
bitwise copy in this case is
sufficient and that ought to be auto-generated.  -- Except we're in compiler 
land here -- are we supposed to do the
auto-generating?  To test the theory I added

 memcpy(&g,&f,sizeof g);

after the assignment (auto g = f) to force the instance data to be copied
from f to g.  It worked!  So why is the compiler not generating suitable code
for the lambda class copy -- the size is right, but no copy of instance data
is made -- maybe it's already decided that the size is zero (as it was before
the lambda body) and generated 'do-nothing' copy constructors?

I had a poke around in gcc/cp/{class.c,parser.c,semantics.c} and believe I
have worked around the issue -- I don't think the proper solution is as
straightforward (or maybe it's more straightforward for a gcc guru?).

Note that the following diff hunk is after an svnmerge of trunk commits into my 
wc of the lambda branch so offsets
will likely be wrong.  I would give a full diff but my wc is in rather a messy 
state at the mo.  I have no idea
whether there are any nasty side effects to doing this but it seems to do the 
copy correctly afterwards.  I have not
looked into it much further at the mo.  Thought I'd just post my findings.

### snip ##
--- gcc/cp/parser.c (revision 150148)
+++ gcc/cp/parser.c (working copy)
@@ -6936,6 +6991,24 @@

 cp_parser_lambda_body (parser, lambda_expr);

+/* relayout again -- to allow for implicit
+ * parameters to have been added to the capture if it was a
+ * 'default capture' -- note that this would not be necessary if
+ * the stack-pointer variant was implemented -- since the layout
+ * would be known.
+ * Relayingout here might have nasty effect if one were to query
+ * sizeof *this from within the body -- would that even be
+ * possible -- *this would refer to the lambda or 

MELT tutorial on the wiki

2009-07-29 Thread Basile STARYNKEVITCH

Hello All,

I added, as a tutorial on the wiki, a small MELT pass that warns about
fprintf(stdout, ...) in the compiled code.


For GCC hackers, is the page
http://gcc.gnu.org/wiki/writing%20a%20pass%20in%20MELT
clear enough?

(of course, the pass implementation is imperfect, but I have a hard time
finding good *small* examples).


regards.
--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***


Re: GCC 4.3.4 release candidate available

2009-07-29 Thread Angelo Graziosi

Dave Korn wrote:


Bootstrap failure on Cygwin:


/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/host-i686-pc-cygwin/gcc/xgcc -B/gn
u/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/host-i686-pc-cygwin/gcc/ -B/opt/gcc-t
ools/i686-pc-cygwin/bin/ -B/opt/gcc-tools/i686-pc-cygwin/lib/ -isystem /opt/gcc-
tools/i686-pc-cygwin/include -isystem /opt/gcc-tools/i686-pc-cygwin/sys-include
-c -DHAVE_CONFIG_H -O2 -g -g -O2   -I. -I../.././libiberty/../include  -W -Wall
-Wwrite-strings -Wc++-compat -Wstrict-prototypes -pedantic  ../.././libiberty/st
rsignal.c -o strsignal.o
../.././libiberty/strsignal.c:408: error: conflicting types for 'strsignal'
/usr/include/string.h:79: error: previous declaration of 'strsignal' was here
make[2]: *** [strsignal.o] Error 1
make[2]: Leaving directory `/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727/i686-p
c-cygwin/libiberty'
make[1]: *** [all-target-libiberty] Error 2
make[1]: Leaving directory `/gnu/gcc/releases/4.3.4/gcc-4.3.4-RC-20090727'
make: *** [all] Error 2


For the sake of completeness, I want to flag that on *Cygwin-1.5*, the 
gcc-4.3.4-RC-20090727 snapshot, configured as:


${gcc_dir}/configure --prefix="${prefix_dir}" \
  --exec-prefix="${eprefix_dir}" \
  --sysconfdir="${sysconf_dir}" \
  --libdir="${lib_dir}" \
  --libexecdir="${libexec_dir}" \
  --mandir="${man_dir}" \
  --infodir="${info_dir}" \
  --program-suffix="${suffix}" \
  --enable-bootstrap \
  --enable-checking=release \
  --enable-decimal-float=bid \
  --enable-languages=c,c++,fortran \
  --enable-libgomp \
  --enable-libssp \
  --enable-nls \
  --enable-threads=posix \
  --enable-version-specific-runtime-libs \
  --disable-fixed-point \
  --disable-libmudflap \
  --disable-shared \
  --disable-sjlj-exceptions \
  --disable-win32-registry \
  --with-arch=i686 \
  --with-dwarf2 \
  --with-system-zlib \
  --with-tune=generic \
  --without-included-gettext \
  --without-x

builds OK.


Cheers,
Angelo.


Re: [trans-mem] cgraph edges vs function cloning

2009-07-29 Thread Richard Henderson

On 07/29/2009 12:28 AM, Martin Jambor wrote:

> That is not really true.


Whee.  I guess I really didn't understand what was going on.

At present I'm testing the following minimal patch to fix
the double lookup problem I mentioned.  It's uglier than
it ought to be because of incorrect indentation in the
original.


r~
--- tree-inline.c   (revision 150180)
+++ tree-inline.c   (local)
@@ -1496,67 +1496,69 @@ copy_bb (copy_body_data *id, basic_block
 callgraph edges and update or duplicate them.  */
  if (is_gimple_call (stmt))
{
- struct cgraph_edge *edge = cgraph_edge (id->src_node, orig_stmt);
+ struct cgraph_edge *edge;
  int flags;
 
  switch (id->transform_call_graph_edges)
{
- case CB_CGE_DUPLICATE:
-   if (edge)
- cgraph_clone_edge (edge, id->dst_node, stmt,
-  REG_BR_PROB_BASE, 1,
-  edge->frequency, true);
-   break;
-
- case CB_CGE_MOVE_CLONES:
-   cgraph_set_call_stmt_including_clones (id->dst_node, orig_stmt, stmt);
-   break;
-
- case CB_CGE_MOVE:
-   if (edge)
- cgraph_set_call_stmt (edge, stmt);
-   break;
+   case CB_CGE_DUPLICATE:
+ edge = cgraph_edge (id->src_node, orig_stmt);
+ if (edge)
+   edge = cgraph_clone_edge (edge, id->dst_node, stmt,
+ REG_BR_PROB_BASE, 1,
+ edge->frequency, true);
+ break;
+
+   case CB_CGE_MOVE_CLONES:
+ cgraph_set_call_stmt_including_clones (id->dst_node,
+orig_stmt, stmt);
+ edge = cgraph_edge (id->dst_node, stmt);
+ break;
+
+   case CB_CGE_MOVE:
+ edge = cgraph_edge (id->dst_node, orig_stmt);
+ if (edge)
+   cgraph_set_call_stmt (edge, stmt);
+ break;
 
- default:
-   gcc_unreachable ();
+   default:
+ gcc_unreachable ();
}
 
-   edge = cgraph_edge (id->src_node, orig_stmt);
-   /* Constant propagation on argument done during inlining
-  may create new direct call.  Produce an edge for it.  */
-   if ((!edge 
-|| (edge->indirect_call
-&& id->transform_call_graph_edges == CB_CGE_MOVE_CLONES))
-   && is_gimple_call (stmt)
-   && (fn = gimple_call_fndecl (stmt)) != NULL)
- {
-   struct cgraph_node *dest = cgraph_node (fn);
+ /* Constant propagation on argument done during inlining
+may create new direct call.  Produce an edge for it.  */
+ if ((!edge 
+  || (edge->indirect_call
+  && id->transform_call_graph_edges == CB_CGE_MOVE_CLONES))
+ && is_gimple_call (stmt)
+ && (fn = gimple_call_fndecl (stmt)) != NULL)
+   {
+ struct cgraph_node *dest = cgraph_node (fn);
 
-   /* We have missing edge in the callgraph.  This can happen in one case
-  where previous inlining turned indirect call into direct call by
-  constant propagating arguments.  In all other cases we hit a bug
-  (incorrect node sharing is most common reason for missing edges.  */
-   gcc_assert (dest->needed || !dest->analyzed);
-   if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES)
- cgraph_create_edge_including_clones (id->dst_node, dest, stmt,
-  bb->count,
-  compute_call_stmt_bb_frequency (id->dst_node->decl, bb),
-  bb->loop_depth,
-  CIF_ORIGINALLY_INDIRECT_CALL);
-   else
- cgraph_create_edge (id->dst_node, dest, stmt,
- bb->count, CGRAPH_FREQ_BASE,
- bb->loop_depth)->inline_failed
-   = CIF_ORIGINALLY_INDIRECT_CALL;
-   if (dump_file)
- {
-fprintf (dump_file, "Created new direct edge to %s",
- cgraph_node_name (dest));
- }
- }
+ /* We have missing edge in the callgraph.  This can happen
+when previous inlining turned an indirect call into a
+direct call by constant propagating arguments.  In all
+ot

Re: m68k - GCC 4.4.0 generates not so good code from asm inline

2009-07-29 Thread ami_stuff
When I use -O1 with GCC 4.4.0 (-m68060 -fomit-frame-pointer), I get better code.

#include 
#include 

inline int64_t MUL64(int a, int b)
{
    uint32_t resh, resl;
    uint32_t au = a;
    uint32_t bu = b;

    __asm__ ("move.l %0, d5\n\t"
             "move.l %1, d4\n\t"
             "moveq #16, d3\n\t"
             "move.l %0, d2\n\t"
             "mulu %1, %0\n\t"
             "lsr.l d3, d4\n\t"
             "lsr.l d3, d5\n\t"
             "mulu d4, d2\n\t"
             "mulu d5, %1\n\t"
             "mulu d5, d4\n\t"
             "move.l d2, d5\n\t"
             "lsr.l d3, d2\n\t"
             "add.w %1, d5\n\t"
             "addx.l d2, d4\n\t"
             "lsl.l d3, d5\n\t"
             "lsr.l d3, %1\n\t"
             "add.l d5, %0\n\t"
             "addx.l d4, %1\n\t"
             : "=d"(resl), "=d"(resh)
             : "0"(au), "1"(bu)
             : "d2", "d3", "d4", "d5");

    if (a < 0)
        resh -= bu;
    if (b < 0)
        resh -= au;

    return ((uint64_t)resh << 32) | resl;
}

GCC 4.4.0 -O3:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
movem.l #16128,-(sp)
move.l 28(sp),d0
move.l 32(sp),a0
move.l d0,d6
move.l a0,d1
#APP
;# 11 "mul645.c" 1
move.l d6, d5
move.l d1, d4
moveq #16, d3
move.l d6, d2
mulu d1, d6
lsr.l d3, d4
lsr.l d3, d5
mulu d4, d2
mulu d5, d1
mulu d5, d4
move.l d2, d5
lsr.l d3, d2
add.w d1, d5
addx.l d2, d4
lsl.l d3, d5
lsr.l d3, d1
add.l d5, d6
addx.l d4, d1

#NO_APP
tst.l d0
jlt L6
tst.l a0
jlt L7
L3:
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L7:
sub.l d0,d1
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L6:
sub.l a0,d1
tst.l a0
jge L3
jra L7

GCC 4.4.0 -O2:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
movem.l #16128,-(sp)
move.l 28(sp),d0
move.l 32(sp),a0
move.l d0,d6
move.l a0,d1
#APP
;# 11 "mul645.c" 1
move.l d6, d5
move.l d1, d4
moveq #16, d3
move.l d6, d2
mulu d1, d6
lsr.l d3, d4
lsr.l d3, d5
mulu d4, d2
mulu d5, d1
mulu d5, d4
move.l d2, d5
lsr.l d3, d2
add.w d1, d5
addx.l d2, d4
lsl.l d3, d5
lsr.l d3, d1
add.l d5, d6
addx.l d4, d1

#NO_APP
tst.l d0
jlt L6
tst.l a0
jlt L7
L3:
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L7:
sub.l d0,d1
move.l d1,d2
clr.l d3
move.l d2,d0
move.l d3,d1
or.l d6,d1
move.l d0,d6
move.l d1,d7
move.l d7,d1
movem.l (sp)+,#252
rts
L6:
sub.l a0,d1
tst.l a0
jge L3
jra L7

GCC 4.4.0 -O1:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
movem.l #16176,-(sp)
move.l 40(sp),d0
move.l 36(sp),a2
move.l a2,d7
move.l d0,d6
#APP
;# 11 "mul645.c" 1
move.l d7, d5
move.l d6, d4
moveq #16, d3
move.l d7, d2
mulu d6, d7
lsr.l d3, d4
lsr.l d3, d5
mulu d4, d2
mulu d5, d6
mulu d5, d4
move.l d2, d5
lsr.l d3, d2
add.w d6, d5
addx.l d2, d4
lsl.l d3, d5
lsr.l d3, d6
add.l d5, d7
addx.l d4, d6

#NO_APP
tst.l a2
jge L2
sub.l d0,d6
L2:
tst.l d0
jge L3
sub.l a2,d6
L3:
move.l d6,d1
clr.l d2
or.l d7,d2
move.l d1,d0
move.l d2,d1
movem.l (sp)+,#3324
rts

GCC 4.4.0 -O0:

#NO_APP
.text
.even
.globl _MUL64
_MUL64:
lea (-16,sp),sp
movem.l #16128,-(sp)
move.l 44(sp),32(sp)
move.l 48(sp),36(sp)
move.l 32(sp),d1
move.l 36(sp),d0
#APP
;# 11 "mul645.c" 1
move.l d1, d5
move.l d0, d4
moveq #16, d3
move.l d1, d2
mulu d0, d1
lsr.l d3, d4
lsr.l d3, d5
mulu d4, d2
mulu d5, d0
mulu d5, d4
move.l d2, d5
lsr.l d3, d2
add.w d0, d5
addx.l d2, d4
lsl.l d3, d5
lsr.l d3, d0
add.l d5, d1
addx.l d4, d0

#NO_APP
move.l d1,28(sp)
move.l d0,24(sp)
tst.l 44(sp)
jge L2
move.l 36(sp),d0
sub.l d0,24(sp)
L2:
tst.l 48(sp)
jge L3
move.l 32(sp),d2
sub.l d2,24(sp)
L3:
move.l 24(sp),d7
clr.l d6
move.l d7,d0
clr.l d1
move.l 28(sp),a1
lea 0.w,a0
move.l a0,d2
move.l a1,d3
or.l d2,d0
or.l d3,d1
movem.l (sp)+,#252
lea (16,sp),sp
rts 

Regards



Re: m68k - GCC 4.4.0 generates not so good code from asm inline

2009-07-29 Thread Bernd Roesch
Hello 

On 29.07.09, you wrote:

If you have an account, you can report that as a bug.
GCC 4 has the advantage that it is possible to switch the optimizer on or
off from within the source, but I don't know how that works.

> When I use -O1 with GCC 4.4.0 (-m68060 -fomit-frame-pointer), I get better
> code.
> 
> #include 
> #include 
> 
> inline int64_t MUL64(int a, int b)
> {
> 
> uint32_t resh, resl;
> uint32_t au = a;
> uint32_t bu = b;
> 
> __asm__ ("move.l %0, d5\n\t"
> "move.l %1, d4\n\t"
> "moveq #16, d3\n\t"
> "move.l %0, d2\n\t"
> "mulu %1, %0\n\t"
> "lsr.l d3, d4\n\t"
> "lsr.l d3, d5\n\t"
> "mulu d4, d2\n\t"
> "mulu d5, %1\n\t"
> "mulu d5, d4\n\t"
> "move.l d2, d5\n\t"
> "lsr.l d3, d2\n\t"
> "add.w %1, d5\n\t"
> "addx.l d2, d4\n\t"
> "lsl.l d3, d5\n\t"
> "lsr.l d3, %1\n\t"
> "add.l d5, %0\n\t"
> "addx.l d4, %1\n\t"
> : "=d"(resl), "=d"(resh)
> : "0"(au), "1"(bu)
> : "d2", "d3", "d4", "d5");
> 
> if (a < 0)
> resh -= bu;
> if (b < 0)
> resh -= au;
> 
> return ((uint64_t)resh << 32) | resl;
> }
> 
Regards
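
For readers following the listings above: the inline asm block appears to
build an unsigned 32x32->64 product from four 16x16 mulu partial products,
and the compiler-generated code after it adjusts the high word when an
operand is negative. Below is a minimal C sketch of that arithmetic. It is
reconstructed from the -O1 listing, not taken from mul645.c (which is not
shown in this thread), and the names umul32_sketch and mul64_sketch are
made up for illustration.

```c
#include <stdint.h>

/* Unsigned 32x32->64 multiply from four 16x16->32 partial products,
   mirroring the four mulu instructions in the asm block.  */
static uint64_t
umul32_sketch (uint32_t a, uint32_t b)
{
  uint32_t a_lo = a & 0xFFFFu, a_hi = a >> 16;
  uint32_t b_lo = b & 0xFFFFu, b_hi = b >> 16;

  uint32_t ll = a_lo * b_lo;
  uint32_t lh = a_lo * b_hi;
  uint32_t hl = a_hi * b_lo;
  uint32_t hh = a_hi * b_hi;

  /* Middle column: at most 3 * 0xFFFF, so it cannot overflow 32 bits.  */
  uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

  uint32_t lo = (ll & 0xFFFFu) | (mid << 16);
  uint32_t hi = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);

  return ((uint64_t) hi << 32) | lo;
}

/* Signed product: take the unsigned product of the bit patterns and
   subtract the other operand from the high word for each negative
   operand (the two tst/jge/sub.l pairs in the -O1 listing).  */
int64_t
mul64_sketch (int32_t a, int32_t b)
{
  uint64_t u = umul32_sketch ((uint32_t) a, (uint32_t) b);
  uint32_t hi = (uint32_t) (u >> 32);
  uint32_t lo = (uint32_t) u;

  if (a < 0)
    hi -= (uint32_t) b;
  if (b < 0)
    hi -= (uint32_t) a;

  /* Conversion of a value above INT64_MAX is implementation-defined
     before C23; two's complement wrap-around is assumed here.  */
  return (int64_t) (((uint64_t) hi << 32) | lo);
}
```

Comparing a listing against this sketch instruction by instruction is one
way to check whether a given optimization level keeps both corrections.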



GCC 4.5 Status Report (2009-07-29)

2009-07-29 Thread Joseph S. Myers
Status
======

Trunk is in Stage 1.  We expect that Stage 1 will last through at
least the end of August.

Pending large merges include at least Graphite, LTO and VTA and these
will be considered in deciding when to move to Stage 3.  All these
merges will need the usual technical review of patches where not
already approved by maintainers of the relevant parts of the compiler.

The pending Graphite merge was expected to be ready "mid-to-late July"
and was stated in  to
continue using the same cloog-ppl version as GCC 4.4.

Some VTA patches have been posted and a roadmap for merging that
branch is in .

Plans for making LTO ready to merge and providing evidence from test
results that it is ready to merge were posted and discussed at
.

People seeking to have other non-bug-fix patches included in 4.5
should aim to post them in time to be reviewed before the end of Stage 1.
Features not included by then are likely to be delayed to 4.6 unless
the relevant maintainers consider them low enough risk to include in
4.5 (e.g. new ports not requiring changes to target-independent parts
of the compiler).

Quality Data
============

Priority          #     Change from Last Report
--------        ---     -----------------------
P1               16     +  2
P2               97     +  4
P3                4     - 12
--------        ---     -----------------------
Total           117     -  6

Previous Report
===============

http://gcc.gnu.org/ml/gcc/2009-07/msg00274.html

The next report for 4.5.0 will be sent by Mark.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Cannot configure gcc4.4.0 in order to build h8300 crosscompiler

2009-07-29 Thread ariga masahiro

Hello Sumanth, and everyone,

Thank you for your reply.

Sumanth wrote:

> Try changing MINIMUM_UNITS_PER_WORD to greater than or equal to 2 in ur
> target.h file


I think you mean MIN_UNITS_PER_WORD in gcc/config/h8300/h8300.h.

Unfortunately, it is already defined as 2.

Here are excerpts from gcc/config/h8300/h8300.h after applying the patch.
The converted file was sent previously as
patch-sources/gcc4.4.0-patch-converted-sources/h8300.h.


-- excerpts begin
/* Define this if most significant bit is lowest numbered
  in instructions that operate on numbered bit-fields.
  This is not true on the H8/300.  */
#define BITS_BIG_ENDIAN 0

/* Define this if most significant byte of a word is the lowest numbered.  */

/* That is true on the H8/300.  */
#define BYTES_BIG_ENDIAN 1

/* Define this if most significant word of a multiword number is lowest
  numbered.  */
#define WORDS_BIG_ENDIAN 1

#define MAX_BITS_PER_WORD 32

/* Width of a word, in units (bytes).  */
#define UNITS_PER_WORD  (TARGET_H8300H || TARGET_H8300S ? 4 : 2)
#define MIN_UNITS_PER_WORD 2

#define SHORT_TYPE_SIZE 16
#define INT_TYPE_SIZE  (TARGET_INT32 ? 32 : 16)
#define LONG_TYPE_SIZE  32

//20090727 patch
//#define LONG_LONG_TYPE_SIZE 64
#define LONG_LONG_TYPE_SIZE (TARGET_INT32 ? 64 : 32)

#define FLOAT_TYPE_SIZE 32

//20090727 patch
//#define DOUBLE_TYPE_SIZE 32
#define DOUBLE_TYPE_SIZE (TARGET_INT32 ? 64 : 32)

#define LONG_DOUBLE_TYPE_SIZE DOUBLE_TYPE_SIZE

#define MAX_FIXED_MODE_SIZE 32

/* Allocation boundary (in *bits*) for storing arguments in argument list.  */

#define PARM_BOUNDARY (TARGET_H8300H || TARGET_H8300S ? 32 : 16)

/* Allocation boundary (in *bits*) for the code of a function.  */
#define FUNCTION_BOUNDARY 16
-- excerpts end
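
To make the effect of the patched macros easier to see, here is a small
sketch that evaluates the same conditional expressions as the excerpt for
a given target configuration. The struct, the function name and the two
boolean parameters are hypothetical stand-ins for the TARGET_INT32 and
TARGET_H8300H || TARGET_H8300S tests; nothing here is part of GCC.

```c
#include <stdbool.h>

/* Sizes in bits, except units_per_word which is in bytes.  */
struct h8300_sizes_sketch
{
  int units_per_word;
  int int_bits;
  int long_bits;
  int long_long_bits;
  int double_bits;
};

/* Mirror the conditional expressions from the patched h8300.h excerpt.  */
struct h8300_sizes_sketch
h8300_sizes_sketch (bool target_int32, bool target_h8300h_or_s)
{
  struct h8300_sizes_sketch s;

  s.units_per_word = target_h8300h_or_s ? 4 : 2;  /* UNITS_PER_WORD */
  s.int_bits = target_int32 ? 32 : 16;            /* INT_TYPE_SIZE */
  s.long_bits = 32;                               /* LONG_TYPE_SIZE */
  s.long_long_bits = target_int32 ? 64 : 32;      /* patched LONG_LONG_TYPE_SIZE */
  s.double_bits = target_int32 ? 64 : 32;         /* patched DOUBLE_TYPE_SIZE */
  return s;
}
```

With the patch, a configuration without TARGET_INT32 gets a 32-bit
long long and a 32-bit double; before the patch, LONG_LONG_TYPE_SIZE was
unconditionally 64 and DOUBLE_TYPE_SIZE was unconditionally 32.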

Masahiro Ariga




Re: Cannot configure gcc4.4.0 in order to build h8300 crosscompiler

2009-07-29 Thread ariga masahiro
Hello,

I found the following patch on the GCC project site.

From: Eric Christopher
To: gcc mailing list
Cc: Pompapathi V Gadad
Date: Tue, 3 Jul 2007 08:35:21 -0700
Subject: [patch] conditionally declare bswap functions depending on target
References: <4678c015.50...@nsc.com> <89cef06d-0a74-4413-b7c0-4d7eb393c...@apple.com> <467a0120.1080...@nsc.com>

Do you think it is effective?

Index: libgcc2.h
===================================================================
--- libgcc2.h   (revision 126242)
+++ libgcc2.h   (working copy)
@@ -342,18 +342,23 @@ extern UWtype __udiv_w_sdiv (UWtype *, U
 extern word_type __cmpdi2 (DWtype, DWtype);
 extern word_type __ucmpdi2 (DWtype, DWtype);

+#if MIN_UNITS_PER_WORD > 1
+extern SItype __bswapsi2 (SItype);
+#endif
+#if LONG_LONG_TYPE_SIZE > 32
+extern DItype __bswapdi2 (DItype);
+#endif
+
 extern Wtype __absvSI2 (Wtype);
 extern Wtype __addvSI3 (Wtype, Wtype);
 extern Wtype __subvSI3 (Wtype, Wtype);
 extern Wtype __mulvSI3 (Wtype, Wtype);
 extern Wtype __negvSI2 (Wtype);
-extern SItype __bswapsi2 (SItype);
 extern DWtype __absvDI2 (DWtype);
 extern DWtype __addvDI3 (DWtype, DWtype);
 extern DWtype __subvDI3 (DWtype, DWtype);
 extern DWtype __mulvDI3 (DWtype, DWtype);
 extern DWtype __negvDI2 (DWtype);
-extern DItype __bswapdi2 (DItype);

 #ifdef COMPAT_SIMODE_TRAPPING_ARITHMETIC
 extern SItype __absvsi2 (SItype);
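
The idea behind the patch, as far as the diff shows, is that each bswap
declaration is only visible when the target can represent the type. A
self-contained sketch of the same guard pattern follows. The macro values
are assumptions chosen for illustration (2 is the MIN_UNITS_PER_WORD value
quoted for h8300.h, 64 is the unpatched LONG_LONG_TYPE_SIZE), and the
sketch_ functions are hypothetical, not libgcc's implementation.

```c
#include <stdint.h>

/* Assumed values for illustration; in GCC these come from the target
   headers.  */
#define MIN_UNITS_PER_WORD 2
#define LONG_LONG_TYPE_SIZE 64

#if MIN_UNITS_PER_WORD > 1
/* Only provided when a 32-bit type is usable on the target.  */
uint32_t
sketch_bswapsi2 (uint32_t x)
{
  return ((x & 0x000000FFu) << 24)
         | ((x & 0x0000FF00u) << 8)
         | ((x >> 8) & 0x0000FF00u)
         | (x >> 24);
}
#endif

#if LONG_LONG_TYPE_SIZE > 32
/* Only provided when long long is wider than 32 bits.  */
uint64_t
sketch_bswapdi2 (uint64_t x)
{
  uint64_t r = 0;
  int i;

  /* Move the lowest byte of x to the bottom of r eight times, so the
     byte order ends up reversed.  */
  for (i = 0; i < 8; i++)
    {
      r = (r << 8) | (x & 0xFFu);
      x >>= 8;
    }
  return r;
}
#endif
```

Changing LONG_LONG_TYPE_SIZE to 32 in this sketch removes sketch_bswapdi2
from the translation unit, which is the behaviour the guards in the diff
are meant to give for the declarations.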

I tried to apply the patch to the GCC 4.4.0 libgcc2.h, but I got the
following messages and the file was not changed at all.

$ patch -p0 < libgcc2_h_diff.txt
(Stripping trailing CRs from patch.)
patching file libgcc2.h
Reversed (or previously applied) patch detected!  Assume -R? [n]
Apply anyway? [n]
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file libgcc2.h.rej

I am not skilled at handling patch files, so would you please modify the
patch so that it applies to the GCC 4.4.0 libgcc2.h? And please teach me
how to apply it.

I attach the GCC 4.4.0 libgcc2.h and the patch file.

Masahiro Ariga


libgcc_files.tar.bz2
Description: Binary data