Re: Segmentation fault for the following Fortran program at -O3 on x86-64.

2010-08-06 Thread Toon Moene

Sebastian Pop wrote:

> On Thu, Aug 5, 2010 at 15:17, Sebastian Pop wrote:
>
>> On Thu, Aug 5, 2010 at 15:07, Sebastian Pop wrote:
>>
>>> I'm delta reducing this.
>>
>> Reduced it looks like this, and it seems like the bug is in the loop
>> distribution for memset zero changes.
>>
>>       parameter(numlev=3,numoblev=1000)
>>       integer i_otyp(numoblev,numlev), i_styp(numoblev,numlev)
>>       logical l_numob(numoblev,numlev)
>>       do ixe=1,numoblev
>>         do iye=1,numlev
>>           i_otyp(ixe,iye)=0
>>           i_styp(ixe,iye)=0
>>           l_numob(ixe,iye)=.false.
>>         enddo
>>       enddo
>>       do i=1,m
>>         do j=1,n
>>           if (l_numob(i,j)) then
>>             write(20,'(7I4,F12.2,4F16.10)') i_otyp(i,j),i_styp(i,j)
>>           endif
>>         enddo
>>       enddo
>>       end
>
> This is now http://gcc.gnu.org/PR45199
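
For readers unfamiliar with the transformation named above, here is a rough C sketch of what loop distribution for memset-zero is meant to do with the first loop nest of the reduced test case. The function names are made up for illustration, plain int arrays stand in for the Fortran integer and logical arrays, and this is not GCC's implementation, only the before/after shape of the code.

```c
#include <assert.h>
#include <string.h>

#define NUMLEV   3
#define NUMOBLEV 1000

/* Shape of the original loop nest: three stores per iteration.
   C arrays are row-major, so the Fortran indices are swapped.  */
void zero_fused(int otyp[NUMLEV][NUMOBLEV], int styp[NUMLEV][NUMOBLEV],
                int numob[NUMLEV][NUMOBLEV])
{
  for (int ixe = 0; ixe < NUMOBLEV; ixe++)
    for (int iye = 0; iye < NUMLEV; iye++)
      {
        otyp[iye][ixe] = 0;
        styp[iye][ixe] = 0;
        numob[iye][ixe] = 0;   /* stands in for .false. */
      }
}

/* After distribution each store has its own loop nest; a nest that
   only zeroes a whole contiguous array can become a single memset.  */
void zero_distributed(int otyp[NUMLEV][NUMOBLEV], int styp[NUMLEV][NUMOBLEV],
                      int numob[NUMLEV][NUMOBLEV])
{
  size_t bytes = sizeof(int) * NUMLEV * NUMOBLEV;
  memset(otyp, 0, bytes);
  memset(styp, 0, bytes);
  memset(numob, 0, bytes);
}

/* Both versions must leave the arrays in the same all-zero state.  */
void check_equivalence(void)
{
  static int a[3][NUMLEV][NUMOBLEV], b[3][NUMLEV][NUMOBLEV];
  memset(a, 0xff, sizeof a);
  memset(b, 0xff, sizeof b);
  zero_fused(a[0], a[1], a[2]);
  zero_distributed(b[0], b[1], b[2]);
  assert(memcmp(a, b, sizeof a) == 0);
  assert(a[0][0][0] == 0 && b[2][NUMLEV - 1][NUMOBLEV - 1] == 0);
}
```

Calling check_equivalence() from a small main is enough to confirm that the two forms agree; the segmentation fault reported in this thread presumably comes from the compiler getting such a rewrite wrong, which the sketch does not attempt to reproduce.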


Thanks for picking up the ball where I dropped it.  I was so tired 
yesterday that I couldn't wrap my head around reducing the example anymore.


Cheers,

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG  Maartensdijk, The Netherlands
At home: http://moene.org/~toon/; weather: http://moene.org/~hirlam/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html#Fortran


Canadian cross build fails on 64 bits build machine

2010-08-06 Thread Christophe LYON

Hello,

I have noticed a build failure with GCC-4.5.0, when configuring with:
--build=x86_64-unknown-linux-gnu
--host=arm-none-linux-gnueabi
--target=arm-none-linux-gnueabi

The build fails when compiling gcc/genconstants.c for the build machine:
In file included from ../../gcc/rtl.h:28,
                 from ../../gcc/genconstants.c:32:
../../gcc/real.h:84: error: size of array `test_real_width' is negative

I have looked a bit at real.h, and tried to compile with -m32, which works.

Now, I think it should also work without -m32.

From my brief investigation, I think that the problem is due to the 
fact that struct real_value uses the 'long' type for the 'sig' field, 
while the computation of REAL_WIDTH relies on HOST_BITS_PER_WIDE_INT.


Promoting 'sig' to type unsigned HOST_WIDE_INT makes the compilation pass.

Here is a naive patch proposal:
--- gcc-4.5.0/gcc/real.h  2010-01-05 18:14:30.0 +0100
+++ gcc-4.5.0.patched/gcc/real.h  2010-08-06 14:02:03.0 +0200
@@ -40,11 +40,11 @@ enum real_value_class {
   rvc_nan
 };
 
-#define SIGNIFICAND_BITS   (128 + HOST_BITS_PER_LONG)
+#define SIGNIFICAND_BITS   (128 + HOST_BITS_PER_WIDE_INT)
 #define EXP_BITS           (32 - 6)
 #define MAX_EXP            ((1 << (EXP_BITS - 1)) - 1)
-#define SIGSZ              (SIGNIFICAND_BITS / HOST_BITS_PER_LONG)
-#define SIG_MSB            ((unsigned long)1 << (HOST_BITS_PER_LONG - 1))
+#define SIGSZ              (SIGNIFICAND_BITS / HOST_BITS_PER_WIDE_INT)
+#define SIG_MSB            ((unsigned HOST_WIDE_INT)1 << (HOST_BITS_PER_WIDE_INT - 1))
 
 struct GTY(()) real_value {
   /* Use the same underlying type for all bit-fields, so as to make
@@ -56,7 +56,7 @@ struct GTY(()) real_value {
   unsigned int signalling : 1;
   unsigned int canonical : 1;
   unsigned int uexp : EXP_BITS;
-  unsigned long sig[SIGSZ];
+  unsigned HOST_WIDE_INT sig[SIGSZ];
 };


Is it OK?

Thanks,

Christophe.


Re: transitioning cloog to ppl-0.11

2010-08-06 Thread Jack Howarth
On Fri, Aug 06, 2010 at 07:39:49AM +0200, Ralf Wildenhues wrote:
> * Jack Howarth wrote on Fri, Aug 06, 2010 at 02:31:31AM CEST:
> > I have been mulling over how to transition the
> > current gcc4x and cloog packages in fink to the new
> > ppl-0.11 release and believe we really need to have
> > a soversion bump on cloog to safely do this on
> > systems with pre-existing packages.
> 
> Not sure I understand, but why should ppl soversion bump require a cloog
> soversion bump?  In which way is this different from any other library
> dependency that undergoes such a bump, and why should it be handled
> differently?
> 
> Thanks,
> Ralf

Ralf,
   My point was that in this case not only does ppl-0.11 require
the existing soversion of cloog to be rebuilt but also all other
previously built gcc releases that used it as well.
   Considering that the existing cloog-0.15.9 sources refuse
to build against ppl-0.11 due to the version check, it presents
an opportunity to do a soversion bump and a cloog-0.16 release
which would require ppl-0.11 as the minimum required ppl. I do
understand that this isn't the conventional reason to do a
soversion bump, but it does have the advantage of solving the
coherency problems in gcc/cloog/ppl builds. 
   Jack


Re: Canadian cross build fails on 64 bits build machine

2010-08-06 Thread Joseph S. Myers
On Fri, 6 Aug 2010, Christophe LYON wrote:

> From my brief investigation, I think that the problem is due to the fact that
> struct real_value uses the 'long' type for the 'sig' field, while the
> computation of REAL_WIDTH relies on HOST_BITS_PER_WIDE_INT.

No, this is not a problem; it's fine to use long in the representation but 
HOST_WIDE_INT when stored in an rtx.  The issue appears rather to be with

#define REAL_VALUE_TYPE_SIZE (SIGNIFICAND_BITS + 32)

where with 64-bit long there are going to be 32 bits of padding in this 
structure that are not allowed for.  Try changing that 32 to 
HOST_BITS_PER_LONG.
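
A small sketch may make the padding argument concrete. It assumes a simplified stand-in for struct real_value (the bit-fields not visible in the patch above are filled in from memory and should be treated as an assumption), uses the long of whatever machine compiles it, and all helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define EXP_BITS         (32 - 6)
#define BITS_PER_LONG    ((int) (CHAR_BIT * sizeof (long)))
#define SIGNIFICAND_BITS (128 + BITS_PER_LONG)
#define SIGSZ            (SIGNIFICAND_BITS / BITS_PER_LONG)

/* Simplified stand-in: 32 bits of bit-fields followed by an array
   of long.  Not the real GCC definition.  */
struct real_value_sketch
{
  unsigned int cl : 2;
  unsigned int decimal : 1;
  unsigned int sign : 1;
  unsigned int signalling : 1;
  unsigned int canonical : 1;
  unsigned int uexp : EXP_BITS;
  unsigned long sig[SIGSZ];
};

/* Bits the struct really occupies on the machine compiling this file.  */
int actual_bits(void)
{
  return (int) (sizeof (struct real_value_sketch) * CHAR_BIT);
}

/* Bits accounted for by "SIGNIFICAND_BITS + tail_bits"; the header
   uses 32, the suggestion is to use the width of long instead.  */
int assumed_bits(int tail_bits)
{
  return SIGNIFICAND_BITS + tail_bits;
}

/* Padding the compiler inserts between the bit-fields and sig.  */
int padding_bits(void)
{
  return (int) (offsetof (struct real_value_sketch, sig) * CHAR_BIT) - 32;
}

void check_layout(void)
{
  /* Expected to hold on common ILP32 and LP64 ABIs (an assumption).  */
  assert(actual_bits() <= assumed_bits(BITS_PER_LONG));
  assert(padding_bits() >= 0);
}
```

On x86-64 Linux, where long is 64 bits, sig has to start on an 8-byte boundary, so padding_bits() is 32 and the struct is larger than assumed_bits(32) allows for; with a 32-bit long there is no padding and both estimates agree.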

-- 
Joseph S. Myers
jos...@codesourcery.com


[plugin] compiling plugins with g++ and -pedantic gives getopt declaration error

2010-08-06 Thread Luke Dalessandro
Hi Everyone,

I have an Ubuntu 9.04 system that I've installed gcc-4.5.1 on, using an in-tree
build of gmp, mpfr, mpc, and libelf.

  lu...@node:~$ gcc-4.5.1 -v
  Using built-in specs.
  COLLECT_GCC=gcc-4.5.1
  COLLECT_LTO_WRAPPER=/home/luked/local/libexec/gcc/x86_64-unknown-linux-gnu/4.5.1/lto-wrapper
  Target: x86_64-unknown-linux-gnu
  Configured with: ../configure --prefix=/home/luked/local --enable-lto --enable-plugin --enable-multilib --program-suffix=-4.5.1
  Thread model: posix
  gcc version 4.5.1 (GCC)

I'm evaluating the plugin interface for an academic project, but I have the 
following compilation problem. The test I am using is:

  lu...@node:~$ cat test.cxx 
  extern "C" {
  #include "gcc-plugin.h"
  }

This compiles fine without "-pedantic":

  lu...@node:~$ g++-4.5.1 -I`g++-4.5.1 -print-file-name=plugin`/include -c test.cxx

But adding "-pedantic" results in an error due to a function type mismatch for 
getopt.

  lu...@node:~$ g++-4.5.1 -I`g++-4.5.1 -print-file-name=plugin`/include -pedantic -c test.cxx
  In file included from /home/luked/local/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/plugin/include/gcc-plugin.h:28:0,
                   from test.cxx:2:
  /home/luked/local/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/plugin/include/system.h:382:53: error: declaration of ‘int getopt(int, char* const*, const char*)’ throws different exceptions
  /usr/include/getopt.h:152:12: error: from previous declaration ‘int getopt(int, char* const*, const char*) throw ()’

The plugin/include/system.h does indeed declare getopt without the throw().

  luked@:~$ sed -n 380,384p /home/luked/local/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/plugin/include/system.h

  #if defined (HAVE_DECL_GETOPT) && !HAVE_DECL_GETOPT
  extern int getopt (int, char * const *, const char *);
  #endif

And my auto-host.h defines HAVE_DECL_GETOPT as 0.

  lu...@node:~$ sed -n 679,683p /home/luked/local/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/plugin/include/auto-host.h
  /* Define to 1 if we found a declaration for 'getopt', otherwise define to 0.
 */
  #ifndef USED_FOR_TARGET
  #define HAVE_DECL_GETOPT 0
  #endif

I don't know what "USED_FOR_TARGET" means, but it isn't defined anywhere inside 
of the plugin.

  lu...@node1x4x2a:~/local/lib/gcc/x86_64-unknown-linux-gnu/4.5.1/plugin$ grep -r USED_FOR_TARGET * | uniq
  include/config/i386/i386.h:#ifndef USED_FOR_TARGET
  include/auto-host.h:#ifndef USED_FOR_TARGET
  include/tm.h:#if defined IN_GCC && !defined GENERATOR_FILE && !defined USED_FOR_TARGET
  include/coretypes.h:#ifndef USED_FOR_TARGET

Does this seem like a configuration bug that I should report, or is it more 
likely that there's something wrong with my system?

Thanks,
Luke



Re: transitioning cloog to ppl-0.11

2010-08-06 Thread Jack Howarth
Ralf,
  Looking at Fedora 13 and Debian
unstable, I see that their gcc 4.4
compilers are using -ldl to avoid
an explicit linkage on libppl_c, libppl
and libcloog. However this still leaves
them open to a mismatch should they 
silently upgrade libcloog from a
version built against ppl-0.10.2 to
one built against ppl-0.11. The gcc
build is loading the ppl headers via
the cloog headers so one ends up with
a gcc built against older ppl headers
that loads the libcloog.0.* built with
newer ppl headers.
Jack


Re: transitioning cloog to ppl-0.11

2010-08-06 Thread Richard Guenther
On Fri, Aug 6, 2010 at 4:49 PM, Jack Howarth wrote:
> Ralf,
>  Looking at Fedora 13 and Debian
> unstable, I see that their gcc 4.4
> compilers are using -ldl to avoid
> an explicit linkage on libppl_c, libppl
> and libcloog. However this still leaves
> them open to a mismatch should they
> silently upgrade libcloog from a
> version built against ppl-0.10.2 to
> one built against ppl-0.11. The gcc
> build is loading the ppl headers via
> the cloog headers so one ends up with
> a gcc built against older ppl headers
> that loads the libcloog.0.* built with
> newer ppl headers.

We can include a runtime version check.
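
As a rough illustration of what such a check could look like, the sketch below compares the version of the headers a program was compiled against with the version a library reports at run time. Everything in it is hypothetical: the struct, the BUILT_AGAINST_* macros and the "same major.minor" policy are assumptions for the example, not PPL or CLooG interfaces.

```c
#include <assert.h>

/* Hypothetical version triple, as a library might report it.  */
struct lib_version
{
  int major;
  int minor;
  int revision;
};

/* Version of the headers seen at compile time.  In a real build these
   would come from macros in the library's own header; the values here
   are placeholders.  */
#define BUILT_AGAINST_MAJOR    0
#define BUILT_AGAINST_MINOR    10
#define BUILT_AGAINST_REVISION 2

/* Returns 1 when the library found at run time has the same
   major.minor as the headers the program was compiled against.
   Treating revisions as compatible is an assumed policy.  */
int version_matches(struct lib_version runtime)
{
  return runtime.major == BUILT_AGAINST_MAJOR
         && runtime.minor == BUILT_AGAINST_MINOR;
}

void demo(void)
{
  struct lib_version same  = { 0, 10, 2 };
  struct lib_version newer = { 0, 11, 0 };
  assert(version_matches(same));
  assert(!version_matches(newer));
}
```

A check of this kind only covers the library the caller queries directly; it does not by itself say which copy a second library in the same process was built against.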

Richard.

>                Jack
>


Re: Canadian cross build fails on 64 bits build machine

2010-08-06 Thread Christophe Lyon

On 06.08.2010 15:53, Joseph S. Myers wrote:
> On Fri, 6 Aug 2010, Christophe LYON wrote:
>
>> From my brief investigation, I think that the problem is due to the fact that
>> struct real_value uses the 'long' type for the 'sig' field, while the
>> computation of REAL_WIDTH relies on HOST_BITS_PER_WIDE_INT.
>
> No, this is not a problem; it's fine to use long in the representation but
> HOST_WIDE_INT when stored in an rtx.  The issue appears rather to be with
>
> #define REAL_VALUE_TYPE_SIZE (SIGNIFICAND_BITS + 32)
>
> where with 64-bit long there are going to be 32 bits of padding in this
> structure that are not allowed for.  Try changing that 32 to
> HOST_BITS_PER_LONG.



It does not fix my problem: HOST_BITS_PER_LONG is still 32. Remember, my
host is ARM, my target is ARM, but my build machine is x86_64, which
makes the 'sig' field of 'real_value' an array of 5 * 64 bits, while
SIGSZ, SIGNIFICAND_BITS and REAL_VALUE_TYPE_SIZE are all defined with
respect to HOST_BITS_PER_LONG, which is 32.
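
The arithmetic behind this can be sketched in a few lines of C. The formulas are the ones quoted in this thread (SIGNIFICAND_BITS = 128 + HOST_BITS_PER_LONG, SIGSZ = SIGNIFICAND_BITS / HOST_BITS_PER_LONG, REAL_VALUE_TYPE_SIZE = SIGNIFICAND_BITS + 32); the function names are hypothetical, and comparing the array alone against the declared size is a simplification of the real test_real_width check.

```c
#include <assert.h>

/* Number of elements in sig for a configured HOST_BITS_PER_LONG.  */
int sigsz(int host_bits_per_long)
{
  return (128 + host_bits_per_long) / host_bits_per_long;
}

/* Bits the header accounts for: REAL_VALUE_TYPE_SIZE.  */
int declared_bits(int host_bits_per_long)
{
  return 128 + host_bits_per_long + 32;
}

/* Bits sig really occupies when each element is actual_long_bits wide
   on the machine doing the compilation.  */
int sig_bits(int host_bits_per_long, int actual_long_bits)
{
  return sigsz(host_bits_per_long) * actual_long_bits;
}

/* Nonzero when sig alone is already larger than the declared size.  */
int overflows(int host_bits_per_long, int actual_long_bits)
{
  return sig_bits(host_bits_per_long, actual_long_bits)
         > declared_bits(host_bits_per_long);
}

void demo(void)
{
  assert(sigsz(32) == 5);
  assert(sig_bits(32, 64) == 320);     /* the "5 * 64 bits" case */
  assert(declared_bits(32) == 192);
  assert(overflows(32, 64));           /* ARM host macros, x86_64 build */
  assert(!overflows(32, 32));          /* macros and compiler agree */
}
```

With the macros configured for a 32-bit long but the file compiled by a compiler whose long is 64 bits, the array is 320 bits against a declared 192, which is consistent with the negative array size reported for test_real_width.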


Christophe.



Re: transitioning cloog to ppl-0.11

2010-08-06 Thread Jack Howarth
On Fri, Aug 06, 2010 at 05:05:19PM +0200, Richard Guenther wrote:
> On Fri, Aug 6, 2010 at 4:49 PM, Jack Howarth  wrote:
> > Ralf,
> >  Looking at Fedora 13 and Debian
> > unstable, I see that their gcc 4.4
> > compilers are using -ldl to avoid
> > an explicit linkage on libppl_c, libppl
> > and libcloog. However this still leaves
> > them open to a mismatch should they
> > silently upgrade libcloog from a
> > version built against ppl-0.10.2 to
> > one built against ppl-0.11. The gcc
> > build is loading the ppl headers via
> > the cloog headers so one ends up with
> > a gcc built against older ppl headers
> > that loads the libcloog.0.* built with
> > newer ppl headers.
> 
> We can include a runtime version check.
> 
> Richard.

Richard,
   Which libppl would get checked? If gcc
loads libppl via -ldl and libcloog also
is directly linked to libppl, would a version
check from gcc be looking at the dl loaded
libppl or the one linked to libcloog? Or
would this do a direct version check with
ppl in gcc and compare that to the answer
from a version check done through cloog's
interfaces?
   Jack


Re: Canadian cross build fails on 64 bits build machine

2010-08-06 Thread Joseph S. Myers
On Fri, 6 Aug 2010, Christophe Lyon wrote:

> On 06.08.2010 15:53, Joseph S. Myers wrote:
> > On Fri, 6 Aug 2010, Christophe LYON wrote:
> > 
> > >  From my brief investigation, I think that the problem is due to the fact
> > > that
> > > struct real_value uses the 'long' type for the 'sig' field, while the
> > > computation of REAL_WIDTH relies on HOST_BITS_PER_WIDE_INT.
> > 
> > No, this is not a problem; it's fine to use long in the representation but
> > HOST_WIDE_INT when stored in an rtx.  The issue appears rather to be with
> > 
> > #define REAL_VALUE_TYPE_SIZE (SIGNIFICAND_BITS + 32)
> > 
> > where with 64-bit long there are going to be 32 bits of padding in this
> > structure that are not allowed for.  Try changing that 32 to
> > HOST_BITS_PER_LONG.
> > 
> 
> It does not fix my problem: HOST_BITS_PER_LONG is still 32. Remember, my host
> is ARM, my target is ARM, but my build machine is x86_64, which makes the
> 'sig' field of 'real_value' an array of 5 * 64 bits, while SIGSZ,
> SIGNIFICAND_BITS and REAL_VALUE_TYPE_SIZE are all defined with respect to
> HOST_BITS_PER_LONG, which is 32.

Then the problem is that this structure is being defined for the build 
system using macros whose values are set for the host; logically, when 
compiling this code for the build system, HOST_* should have definitions 
that are correct for the build system rather than for the host, but it is 
also possible that certain structures do not need to be defined on the 
build system at all.  You will need to gain a full understanding of how 
the definitions are used on the build system and whether build/host 
differences will cause any issues for generated files used in the build in 
order to work out a proper fix.

There are many places in the compiler that hardcode the use of "long" in 
this structure; you cannot change that type safely with a local patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Bruce Korb
The problem seems to be that GDB thinks all the code belongs to a
single line of text.  At first, it was a file of mine, so I presumed
I had done something strange and passed it off.  I needed to do some
more debugging again and my "-g -O0" output still said all code
belonged to that one line.  So, I made a .i file and compiled that.
Different file, but the same problem.  The .i file contains the
correct preprocessor directives:

  # 309 "wrapup.c"
  static void
  done_check(void)
  {

but under gdb:

  (gdb) b done_check
  Breakpoint 5 at 0x40af44: file /usr/include/gmp.h, line 1661.

the break point *is* on the entry to "done_check", but the
source code displayed is line 1661 of gmp.h.  Not helpful.
Further, I cannot set break points on line numbers because
all code belongs to the one line in gmp.h.

Yes, for now I can debug in assembly code, but it isn't very easy.
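
For reference, the line markers mentioned above (`# 309 "wrapup.c"` in the .i file) are the preprocessor-output form of the standard `#line` directive: they tell the compiler which file and line the following text should be attributed to, and that is the information that ends up in the debug line table. A minimal sketch, with hypothetical function names, showing the effect through `__LINE__` and `__FILE__`:

```c
#include <assert.h>
#include <string.h>

/* Reports the physical line of this function in the current file.  */
int line_before(void) { return __LINE__; }

#line 309 "wrapup.c"
int line_after(void) { return __LINE__; }   /* attributed to line 309 */
const char *file_after(void) { return __FILE__; }

void demo(void)
{
  /* After the directive, the compiler numbers lines from 309 and
     reports the file name given in the directive.  */
  assert(line_after() == 309);
  assert(strcmp(file_after(), "wrapup.c") == 0);
}
```

If the markers in the .i file are right, as they are here, a debugger showing every function at one line of gmp.h points at how the line table is being written or read rather than at the source itself.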

$ gcc --version
gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]
Copyright (C) 2010 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I've googled for:  gcc|gdb wrong source file
which only yields how to examine source files in gdb.


Re: transitioning cloog to ppl-0.11

2010-08-06 Thread Ralf Wildenhues
* Jack Howarth wrote on Fri, Aug 06, 2010 at 03:01:16PM CEST:
> My point was that in this case not only does ppl-0.11 require
> the existing soversion of cloog to be rebuilt but also all other
> previously built gcc releases that used it as well.
> Considering that the existing cloog-0.15.9 sources refuse
> to build against ppl-0.11 due to the version check,

Does cloog-0.15.9 check for the exact version, rather than using
something like AS_VERSION_COMPARE and checking for a minimum version?
Has that been addressed in the cloog sources since?

> it presents
> an opportunity to do a soversion bump and a cloog-0.16 release
> which would require ppl-0.11 as the minimum required ppl. I do
> understand that this isn't the conventional reason to do a
> soversion bump, but it does have the advantage of solving the
> coherency problems in gcc/cloog/ppl builds. 

I don't mind, but it's a good idea to ensure that for the next
major version bump of ppl things are prepared to (hopefully)
not require cascades of major bumps more than necessary.

Cheers,
Ralf


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread David Daney

On 08/06/2010 10:19 AM, Bruce Korb wrote:
> The problem seems to be that GDB thinks all the code belongs to a
> single line of text.  At first, it was a file of mine, so I presumed
> I had done something strange and passed it off.  I needed to do some
> more debugging again and my "-g -O0" output still said all code
> belonged to that one line.  So, I made a .i file and compiled that.
> Different file, but the same problem.  The .i file contains the
> correct preprocessor directives:
>
>    # 309 "wrapup.c"
>    static void
>    done_check(void)
>    {
>
> but under gdb:
>
>    (gdb) b done_check
>    Breakpoint 5 at 0x40af44: file /usr/include/gmp.h, line 1661.
>
> the break point *is* on the entry to "done_check", but the
> source code displayed is line 1661 of gmp.h.  Not helpful.
> Further, I cannot set break points on line numbers because
> all code belongs to the one line in gmp.h.
>
> Yes, for now I can debug in assembly code, but it isn't very easy.
>
> $ gcc --version
> gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]
> Copyright (C) 2010 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>
> I've googled for:  gcc|gdb wrong source file
> which only yields how to examine source files in gdb.



Which version of GDB?

IIRC with GCC-4.5 you need a very new version of GDB.  This page:

http://gcc.gnu.org/gcc-4.5/changes.html

indicates that GDB 7.0 or later would be good candidates.

David Daney.


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Bruce Korb
On 08/06/10 10:19, Bruce Korb wrote:
> The problem seems to be that GDB thinks all the code belongs to a
> single line of text.  At first, it was a file of mine, so I presumed
> I had done something strange and passed it off.  I needed to do some
> more debugging again and my "-g -O0" output still said all code
> belonged to that one line.  So, I made a .i file and compiled that.
> Different file, but the same problem.  The .i file contains the
> correct preprocessor directives:

Followup:  I stripped all blank lines and preprocessor directives from
the .i file.

(gdb) b main
Breakpoint 8 at 0x40ab35: file ag.i, line 2900.
(gdb) b inner_main
Breakpoint 9 at 0x40aa7a: file ag.i, line 2900.
    2898 __gmpz_fits_uint_p (mpz_srcptr __gmp_z)
-   2899 {
-   2900   mp_size_t __gmp_n = __gmp_z->_mp_size; mp_ptr __gmp_p = __gmp_z->_mp_d; return (__gmp_n == 0 || (__gmp_n == 1 && __gmp_p[0] <= (~ (unsigned) 0)));;
    2901 }

There are 18,000 lines in this file, so it isn't just the end.
Would someone like the 18,000 line file, or is this a known problem
that can be found with a different google expression?


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Bruce Korb
On 08/06/10 10:24, David Daney wrote:
> On 08/06/2010 10:19 AM, Bruce Korb wrote:
>> The problem seems to be that GDB thinks all the code belongs to a
>> single line of text.  At first, it was a file of mine, so I presumed
>> I had done something strange and passed it off.  I needed to do some
>> more debugging again and my "-g -O0" output still said all code
>> belonged to that one line.  So, I made a .i file and compiled that.
>> Different file, but the same problem.  The .i file contains the
>> correct preprocessor directives:
>>
>>   # 309 "wrapup.c"
>>   static void
>>   done_check(void)
>>   {
>>
>> but under gdb:
>>
>>   (gdb) b done_check
>>   Breakpoint 5 at 0x40af44: file /usr/include/gmp.h, line 1661.
>>
>> the break point *is* on the entry to "done_check", but the
>> source code displayed is line 1661 of gmp.h.  Not helpful.
>> Further, I cannot set break points on line numbers because
>> all code belongs to the one line in gmp.h.
>>
>> Yes, for now I can debug in assembly code, but it isn't very easy.
>>
>> $ gcc --version
>> gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> This is free software; see the source for copying conditions.  There
>> is NO
>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
>> PURPOSE.
>>
>> I've googled for:  gcc|gdb wrong source file
>> which only yields how to examine source files in gdb.
>>
> 
> Which version of GDB?
> 
> IIRC with GCC-4.5 you need a very new version of GDB.  This page:
> 
> http://gcc.gnu.org/gcc-4.5/changes.html
> 
> indicates that GDB 7.0 or later would be good candidates.

That seems to work.  There are one or two or three bugs then.
Either gdb needs to recognize an out of sync object code, or else
gcc needs to produce object code that forces gdb to object in a way
more obvious than just deciding upon the wrong file and line --
or both.  I simply installed the latest openSuSE and got whatever
was supplied.  It isn't reasonable to expect folks to go traipsing
through upstream web sites looking for "changes.html" files.

And, of course, the insight stuff needs to incorporate the latest
and greatest gdb.  (I don't use ddd because it is _completely_ non-
intuitive.)


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread David Daney

On 08/06/2010 10:51 AM, Bruce Korb wrote:
> On 08/06/10 10:24, David Daney wrote:
>> On 08/06/2010 10:19 AM, Bruce Korb wrote:
>>> The problem seems to be that GDB thinks all the code belongs to a
>>> single line of text.  At first, it was a file of mine, so I presumed
>>> I had done something strange and passed it off.  I needed to do some
>>> more debugging again and my "-g -O0" output still said all code
>>> belonged to that one line.  So, I made a .i file and compiled that.
>>> Different file, but the same problem.  The .i file contains the
>>> correct preprocessor directives:
>>>
>>>   # 309 "wrapup.c"
>>>   static void
>>>   done_check(void)
>>>   {
>>>
>>> but under gdb:
>>>
>>>   (gdb) b done_check
>>>   Breakpoint 5 at 0x40af44: file /usr/include/gmp.h, line 1661.
>>>
>>> the break point *is* on the entry to "done_check", but the
>>> source code displayed is line 1661 of gmp.h.  Not helpful.
>>> Further, I cannot set break points on line numbers because
>>> all code belongs to the one line in gmp.h.
>>>
>>> Yes, for now I can debug in assembly code, but it isn't very easy.
>>>
>>> $ gcc --version
>>> gcc (SUSE Linux) 4.5.0 20100604 [gcc-4_5-branch revision 160292]
>>> Copyright (C) 2010 Free Software Foundation, Inc.
>>> This is free software; see the source for copying conditions.  There is NO
>>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>>
>>> I've googled for:  gcc|gdb wrong source file
>>> which only yields how to examine source files in gdb.
>>
>> Which version of GDB?
>>
>> IIRC with GCC-4.5 you need a very new version of GDB.  This page:
>>
>> http://gcc.gnu.org/gcc-4.5/changes.html
>>
>> indicates that GDB 7.0 or later would be good candidates.
>
> That seems to work.  There are one or two or three bugs then.
> Either gdb needs to recognize an out of sync object code

It cannot do this as it was released before GCC-4.5.

> , or else
> gcc needs to produce object code that forces gdb to object in a way
> more obvious than just deciding upon the wrong file and line --
> or both.  I simply installed the latest openSuSE and got whatever
> was supplied.  It isn't reasonable to expect folks to go traipsing
> through upstream web sites looking for "changes.html" files
>
> And, of course, the insight stuff needs to incorporate the latest
> and greatest gdb.  (I don't use ddd because it is _completely_ non-
> intuitive.)



My understanding is that whoever packages GCC and GDB for a particular 
distribution is responsible to make sure that they work together.


In your case it looks like that didn't happen.

David Daney




Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Bruce Korb
On Fri, Aug 6, 2010 at 11:19 AM, David Daney wrote:
>> That seems to work.  There are one or two or three bugs then.
>> Either gdb needs to recognize an out of sync object code
>
> It cannot do this as it was released before GCC-4.5.

GDB and GCC communicate with each other with particular conventions.
Conventions will change over time.  GCC cannot really know which debugger
is going to be used, so it just emits its code and debug information.
GDB, on the other hand, needs to know what conventions were used when
the binaries were produced.  If it cannot tell, it is a GCC issue _and_ a
GDB issue.  If it can tell and chooses to indicate the problem by supplying
bogus responses, then it is solely a GDB bug.  Either way, we have a bug.

>> And, of course, the insight stuff needs to incorporate the latest
>> and greatest gdb.  (I don't use ddd because it is _completely_ non-
>> intuitive.)
>
> My understanding is that whoever packages GCC and GDB for a particular
> distribution is responsible to make sure that they work together.
>
> In your case it looks like that didn't happen.

openSuSE seems to think that ddd is an adequate debugger.  I do not.
I use insight.  There is no automatic update to insight and insight does
not currently have a 7.x release of the underlying GDB.  They are tightly
bound.  :(

Anyway, I now know what the problem is and I am anxiously awaiting a
new release of Insight -- and I recommend some protocol versioning fixes
for GDB and, possibly, GCC too.


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Tom Tromey
> "Bruce" == Bruce Korb  writes:

Bruce> That seems to work.  There are one or two or three bugs then.
Bruce> Either gdb needs to recognize an out of sync object code, or else
Bruce> gcc needs to produce object code that forces gdb to object in a way
Bruce> more obvious than just deciding upon the wrong file and line --
Bruce> or both.

Nothing can be done about old versions of gdb.  They are fixed.

I think the situation is better in newer versions of GDB.  We've fixed a
lot of bugs, anyway.  (I'm not sure exactly what problem you hit, so I
don't know if gdb is in fact any more future-proof in that area.)

I don't think things can ever be perfect.  GDB checks the various DWARF
version numbers, but that doesn't exclude extensions.

Bruce> I simply installed the latest openSuSE and got whatever was
Bruce> supplied.  It isn't reasonable to expect folks to go traipsing
Bruce> through upstream web sites looking for "changes.html" files 

In a situation like this, I suggest complaining to your vendor.  We've
done a lot of work in GDB to catch up with GCC's changing output.  The
development process here is actually reasonably well synchronized.

Tom


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Richard Guenther
On Fri, Aug 6, 2010 at 8:41 PM, Tom Tromey wrote:
>> "Bruce" == Bruce Korb  writes:
>
> Bruce> That seems to work.  There are one or two or three bugs then.
> Bruce> Either gdb needs to recognize an out of sync object code, or else
> Bruce> gcc needs to produce object code that forces gdb to object in a way
> Bruce> more obvious than just deciding upon the wrong file and line --
> Bruce> or both.
>
> Nothing can be done about old versions of gdb.  They are fixed.
>
> I think the situation is better in newer versions of GDB.  We've fixed a
> lot of bugs, anyway.  (I'm not sure exactly what problem you hit, so I
> don't know if gdb is in fact any more future-proof in that area.)
>
> I don't think things can ever be perfect.  GDB checks the various DWARF
> version numbers, but that doesn't exclude extensions.
>
> Bruce> I simply installed the latest openSuSE and got whatever was
> Bruce> supplied.  It isn't reasonable to expect folks to go traipsing
> Bruce> through upstream web sites looking for "changes.html" files 
>
> In a situation like this, I suggest complaining to your vendor.  We've
> done a lot of work in GDB to catch up with GCC's changing output.  The
> development process here is actually reasonably well synchronized.

The gdb version on openSUSE that ships with GCC 4.5 is perfectly fine
(it's 7.1 based).  No idea what the reporter is talking about (we don't ship
insight IIRC).

Richard.

> Tom
>


Re: Bizarre GCC problem - how do I debug it?

2010-08-06 Thread Bruce Korb
Hi Richard,

On Fri, Aug 6, 2010 at 11:43 AM, Richard Guenther wrote:
> The gdb version on openSUSE that ships with GCC 4.5 is perfectly fine
> (it's 7.1 based).  No idea what the reporter is talking about (we don't ship
> insight IIRC).

You are remembering correctly.  I was not clear enough.  I use Insight
and Insight is tightly bound to a particular version of GDB.  Since
Insight is not distributed or supported by openSuSE, this is not an
openSuSE issue.  This is an Insight issue (for having fallen behind,
though it is understandable...) and it is a GDB (and? GCC) issue
*because* the _failure_mode_is_too_confusing_.  GDB/GCC should be
coordinating so that GDB looks at the binary and says, "I do not
understand the debug information".  Instead, GDB believes that there
are no newline characters in the input.  Is that a GDB issue or a GCC
issue?  I cannot say.  What I can say is that the hapless user should
be able to read an error message and know what the problem is.

This does not tell me:

>  (gdb) b done_check
>  Breakpoint 5 at 0x40af44: file /usr/include/gmp.h, line 1661.

Thank you everyone.  Regards, Bruce


GCC Binary

2010-08-06 Thread Erick Garske
Is there a location where I can download the binary of GCC for the IBM i?

http://gcc.gnu.org/install/binaries.html

Are any of these compatible for the IBM i at V6R1M0?

Thanks,
Erick


Re: GCC Binary

2010-08-06 Thread Peter Bergner
On Fri, 2010-08-06 at 12:27 -0700, Erick Garske wrote:
> Is there a location where I can download the binary of GCC for the IBM i?
> 
> http://gcc.gnu.org/install/binaries.html
> 
> Are any of these compatible for the IBM i at V6R1M0?

There is no support in GCC for native iSeries (AKA AS/400).


Peter





Re: GCC Binary

2010-08-06 Thread Kevin Bowling
On Fri, Aug 6, 2010 at 1:28 PM, Peter Bergner wrote:
> On Fri, 2010-08-06 at 12:27 -0700, Erick Garske wrote:
>> Is there a location where I can download the binary of GCC for the IBM i?
>>
>> http://gcc.gnu.org/install/binaries.html
>>
>> Are any of these compatible for the IBM i at V6R1M0?
>
> There is no support in GCC for native iSeries (AKA AS/400).
>

I don't know if they kept it around for V6+, but under PASE AIX
binaries are supposed to function as-is.  Whether that means you will
need to compile on an AIX LPAR or can self-host GCC under PASE is
worth testing.


kernel BUG at fs/ext4/mballoc.c:2993!

2010-08-06 Thread Justin Mattock
Hello,
I just built a fresh CLFS system using the tutorial. Right now I'm
able to boot and log in, and the system seems to be running as
it should, except that when I try to install gmp and/or run /sbin/lilo
I see a message appear on screen (below). If I then run any kind of
command (dmesg > dmesg) the screen gets stuck. Has anything
similar to the message below been seen before?

Keep in mind the kernel I'm using is 2.6.35-rc6, which runs just fine
on other machines (same type of system) without such a message.

The only real thing I did differently with this build was to build the
latest gcc with gmp/mpfr/mpc inside the gcc source directory, instead of
installing them on the system and then using the switches to point to
their location.



<0>[   48.976957] ------------[ cut here ]------------
<2>[   48.977187] kernel BUG at fs/ext4/mballoc.c:2993!
<0>[   48.977415] invalid opcode:  [#1] SMP
<0>[   48.977694] last sysfs file: /sys/devices/virtual/vc/vcsa12/uevent
<4>[   48.977873] CPU 0
<4>[   48.977873] Modules linked in: uvcvideo videodev v4l1_compat firewire_ohci firewire_core ohci1394 i2c_nforce2 ohci_hcd forcedeth evdev thermal button aes_x86_64 lzo lzo_decompress lzo_compress tun kvm_intel ipcomp xfrm_ipcomp crypto_null sha256_generic cbc des_generic cast5 blowfish serpent camellia twofish twofish_common ctr ah4 esp4 authenc adm1021 raw1394 ieee1394 uhci_hcd ehci_hcd hci_uart rfcomm btusb hidp l2cap bluetooth coretemp acpi_cpufreq processor mperf appletouch applesmc
<4>[   48.977873]
<4>[   48.977873] Pid: 1482, comm: lilo Not tainted 2.6.35-rc6 #1 Mac-F2218FC8/iMac9,1
<4>[   48.977873] RIP: 0010:[]  [] ext4_mb_normalize_request+0x2d3/0x342
<4>[   48.977873] RSP: 0018:880137a6fa88  EFLAGS: 00010206
<4>[   48.977873] RAX: 88013eef RBX: 880138ee5000 RCX: 0010
<4>[   48.977873] RDX: 0010 RSI: 0010 RDI: 88013eee1568
<4>[   48.977873] RBP: 880137a6fad8 R08: 0001fff0 R09: 880137a6fb08
<4>[   48.977873] R10: 00016e10 R11: 880137a6fc30 R12: 0010
<4>[   48.977873] R13: 880137a6fc10 R14: 0001fff0 R15: 0002
<4>[   48.977873] FS:  7f58b5b65700() GS:880001a0() knlGS:
<4>[   48.977873] CS:  0010 DS:  ES:  CR0: 8005003b
<4>[   48.977873] CR2: 00669018 CR3: 000138463000 CR4: 000406f0
<4>[   48.977873] DR0:  DR1:  DR2:
<4>[   48.977873] DR3:  DR6: 0ff0 DR7: 0400
<4>[   48.977873] Process lilo (pid: 1482, threadinfo 880137a6e000, task 880137310f60)
<0>[   48.977873] Stack:
<4>[   48.977873]   8050 880138faaca8 000281150729
<4>[   48.977873] <0> 880137a6fb08 880137a6fc10 880137a6fc64 88013eee1568
<4>[   48.977873] <0> 880138ee5000  880137a6fb58 81154ca0
<0>[   48.977873] Call Trace:
<4>[   48.977873]  [] ext4_mb_new_blocks+0x173/0x3d3
<4>[   48.977873]  [] ? ext4_ext_find_extent+0x45/0x2a6
<4>[   48.977873]  [] ext4_ext_map_blocks+0x1732/0x1aeb
<4>[   48.977873]  [] ? radix_tree_gang_lookup_tag_slot+0x81/0xa2
<4>[   48.977873]  [] ? pagevec_lookup_tag+0x20/0x29
<4>[   48.977873]  [] ext4_map_blocks+0x115/0x1f4
<4>[   48.977873]  [] mpage_da_map_blocks+0xeb/0x364
<4>[   48.977873]  [] ? ext4_journal_start_sb+0xc7/0x103
<4>[   48.977873]  [] ext4_da_writepages+0x330/0x579
<4>[   48.977873]  [] ? mutex_unlock+0x9/0xb
<4>[   48.977873]  [] ? generic_file_aio_write+0x84/0xa4
<4>[   48.977873]  [] do_writepages+0x1f/0x28
<4>[   48.977873]  [] __filemap_fdatawrite_range+0x4e/0x50
<4>[   48.977873]  [] filemap_write_and_wait_range+0x28/0x51
<4>[   48.977873]  [] vfs_fsync_range+0x36/0x79
<4>[   48.977873]  [] vfs_fsync+0x17/0x19
<4>[   48.977873]  [] do_fsync+0x29/0x3e
<4>[   48.977873]  [] sys_fdatasync+0xe/0x12
<4>[   48.977873]  [] system_call_fastpath+0x16/0x1b
<0>[   48.977873] Code: 44 8b 45 b8 8b 43 10 89 c2 49 39 d7 7f 07 41 39 c4 76 02 0f 0b 4d 85 f6 74 11 48 8b 7b 08 48 8b 87 28 03 00 00 4c 3b 70 10 76 02 <0f> 0b 44 89 63 20 44 89 43 2c 49 8b 75 28 48 85 f6 74 1f 41 8b
<1>[   48.977873] RIP  [] ext4_mb_normalize_request+0x2d3/0x342
<4>[   48.977873]  RSP 
<4>[   48.994547] ---[ end trace 5f3a007a6b3c50ca ]---




-- 
Justin P. Mattock


Re: kernel BUG at fs/ext4/mballoc.c:2993!

2010-08-06 Thread Ted Ts'o
On Fri, Aug 06, 2010 at 10:48:40PM -0700, Justin Mattock wrote:
> Hello,
> I just built a fresh CLFS system using the tutorial. Right now I'm
> able to boot and log in, and the system seems to be running as
> it should, except that when I try to install gmp and/or run /sbin/lilo
> I see a message appear on screen (below). If I then run any kind of
> command (dmesg > dmesg) the screen gets stuck. Has anything
> similar to the message below been seen before?
> 
> Keep in mind the kernel I'm using is 2.6.35-rc6, which runs just fine
> on other machines (same type of system) without such a message.

Um, is this a completely modified 2.6.35-rc6 kernel?  The reason why I
ask is there is no BUG_ON at line fs/ext4/mballoc.c:2993 for that
kernel version.

There are two BUG_ON statements nearby, but given the line number
doesn't match up with either one, it's hard to say for sure which one
triggered it.  What were the kernel messages right before the BUG_ON?
Was there a "start N size NNN, fe_logical N" message (where each N is
some number) right before the "cut here" message?

Have you tried forcing an fsck run on the file system to make sure
it's not caused by a file-system corruption?

And have you tried using a standard released gcc so we can determine
for sure whether this is a potential kernel bug, file system
corruption issue, or gcc issue?

- Ted