I don't see any line numbers on the errors I flagged - all I see are the usual memory offsets in bytes, which is of little help. I'm afraid I don't know what you'd have to do under SunOS to get line numbers, but I can't do much without them.
On Sep 2, 2014, at 10:26 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
> Hi Ralph,
>
>> Could you please configure this OMPI install with --enable-debug
>> so that gdb will provide line numbers where the error is occurring?
>> Otherwise, I'm having a hard time chasing this problem down.
>
> I always configure with "--enable-debug" and I used the following
> command. In my original email I included a backtrace with line
> numbers for both my C and Java problems.
>
> tyr openmpi-1.9a1r32657-SunOS.sparc.64_cc 119 head config.log
> This file contains any messages produced by compilers while
> running configure, to aid debugging if configure makes a mistake.
>
> It was created by Open MPI configure 1.9a1, which was
> generated by GNU Autoconf 2.69. Invocation command line was
>
> $ ../openmpi-1.9a1r32657/configure --prefix=/usr/local/openmpi-1.9_64_cc
> --libdir=/usr/local/openmpi-1.9_64_cc/lib64
> --with-jdk-bindir=/usr/local/jdk1.8.0/bin
> --with-jdk-headers=/usr/local/jdk1.8.0/include JAVA_HOME=/usr/local/jdk1.8.0
> LDFLAGS=-m64 CC=cc CXX=CC FC=f95 CFLAGS=-m64 CXXFLAGS=-m64 -library=stlport4
> FCFLAGS=-m64 CPP=cpp CXXCPP=cpp CPPFLAGS= CXXCPPFLAGS= --enable-mpi-cxx
> --enable-cxx-exceptions --enable-mpi-java --enable-heterogeneous
> --enable-mpi-thread-multiple --with-threads=posix --with-hwloc=internal
> --without-verbs --with-wrapper-cflags=-m64 --enable-debug
>
>
> What can I do to provide line numbers for the "mca_oob_tcp_accept:
> accept() failed" error?
>
> Kind regards
>
> Siegmar
>
>
>> On Sep 2, 2014, at 6:01 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>>
>>> C problem:
>>> ==========
>>>
>>> tyr small_prog 111 mpiexec -np 1 --host linpc0 init_finalize
>>> [tyr.informatik.hs-fulda.de:00593] mca_oob_tcp_accept: accept() failed: Error 0 (11).
>>> Hello!
>>>
>>> tyr small_prog 112 mpiexec -np 1 --host sunpc0 init_finalize
>>> [tyr.informatik.hs-fulda.de:00597] mca_oob_tcp_accept: accept() failed: Error 0 (11).
>>> Hello!
>>>
>>> tyr small_prog 113 mpiexec -np 1 --host tyr init_finalize
>>> [tyr:00606] *** Process received signal ***
>>> [tyr:00606] Signal: Bus Error (10)
>>> [tyr:00606] Signal code: Invalid address alignment (1)
>>> [tyr:00606] Failing at address: ffffffff7fffd7fc
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x1c
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:0x1a4960
>>> /lib/sparcv9/libc.so.1:0xd8b98
>>> /lib/sparcv9/libc.so.1:0xcc70c
>>> /lib/sparcv9/libc.so.1:0xcc918
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack_int64+0xf4 [ Signal 2096416616 (?)]
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack_buffer+0x168
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack+0x24c
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/openmpi/mca_pmix_native.so:0x14e10
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0xd18
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:MPI_Init+0x26c
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x12c
>>> [tyr:00606] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> mpiexec noticed that process rank 0 with PID 606 on node tyr exited on signal 10 (Bus Error).
>>> --------------------------------------------------------------------------
>>> tyr small_prog 114
>>>
>>>
>>>
>>> gdb shows the following backtrace.
>>>
>>> tyr small_prog 115 /usr/local/gdb-7.6.1_64_gcc/bin/gdb /usr/local/openmpi-1.9_64_cc/bin/mpiexec
>>> GNU gdb (GDB) 7.6.1
>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>>> and "show warranty" for details.
>>> This GDB was configured as "sparc-sun-solaris2.10".
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>...
>>> Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/bin/orterun...done.
>>> (gdb) run -np 1 --host tyr init_finalize
>>> Starting program: /usr/local/openmpi-1.9_64_cc/bin/mpiexec -np 1 --host tyr init_finalize
>>> [Thread debugging using libthread_db enabled]
>>> [New Thread 1 (LWP 1)]
>>> [New LWP 2 ]
>>> [tyr:00628] *** Process received signal ***
>>> [tyr:00628] Signal: Bus Error (10)
>>> [tyr:00628] Signal code: Invalid address alignment (1)
>>> [tyr:00628] Failing at address: ffffffff7fffd73c
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x1c
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:0x1a4960
>>> /lib/sparcv9/libc.so.1:0xd8b98
>>> /lib/sparcv9/libc.so.1:0xcc70c
>>> /lib/sparcv9/libc.so.1:0xcc918
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack_int64+0xf4 [ Signal 2096416616 (?)]
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack_buffer+0x168
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_unpack+0x24c
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/openmpi/mca_pmix_native.so:0x14e10
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0xd18
>>> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:MPI_Init+0x26c
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x12c
>>> [tyr:00628] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> mpiexec noticed that process rank 0 with PID 628 on node tyr exited on signal 10 (Bus Error).
>>> --------------------------------------------------------------------------
>>> [
>>
>