I don't see any line numbers on the errors I flagged - all I see are the usual 
memory offsets in bytes, which are of little help. I'm afraid I don't know what 
you'd have to do under SunOS to get line numbers, but I can't do much without 
them.
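
What the backtrace does show is a signal code of "Invalid address alignment"
inside opal_dss_unpack_int64, with a failing address (ffffffff7fffd7fc) that
ends in 0xc, i.e. 4-byte aligned but not 8-byte aligned. On SPARC that is the
usual symptom of a 64-bit load through a misaligned pointer. The sketch below
only illustrates an alignment-safe way to read such a value in C. It is not
the Open MPI code: load_u64_be is a hypothetical name, and it is an assumption
that the real unpack routine does a direct 64-bit load from the buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of Open MPI): read a 64-bit big-endian
 * (network byte order) integer from a byte buffer at any offset. */
static uint64_t load_u64_be(const unsigned char *src)
{
    unsigned char bytes[8];
    uint64_t value = 0;
    int i;

    assert(src != NULL);

    /* memcpy has no alignment requirement, whereas *(const uint64_t *)src
     * can raise SIGBUS on SPARC when src is not 8-byte aligned. */
    memcpy(bytes, src, sizeof bytes);

    /* Assemble the value byte by byte so the result does not depend on
     * the byte order of the host. */
    for (i = 0; i < 8; i++) {
        value = (value << 8) | bytes[i];
    }
    return value;
}
```

Calling load_u64_be(buf + 4) on a buffer whose payload starts at offset 4
reads the value without a misaligned load, at the cost of an 8-byte copy.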


On Sep 2, 2014, at 10:26 AM, Siegmar Gross 
<siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi Ralph,
> 
>> Could you please configure this OMPI install with --enable-debug
>> so that gdb will provide line numbers where the error is occurring?
>> Otherwise, I'm having a hard time chasing this problem down.
> 
> I always configure with "--enable-debug" and I used the following
> command. In my original email I included a backtrace with line
> numbers for both my C and Java problems.
> 
> tyr openmpi-1.9a1r32657-SunOS.sparc.64_cc 119 head config.log
> This file contains any messages produced by compilers while
> running configure, to aid debugging if configure makes a mistake.
> 
> It was created by Open MPI configure 1.9a1, which was
> generated by GNU Autoconf 2.69.  Invocation command line was
> 
>  $ ../openmpi-1.9a1r32657/configure --prefix=/usr/local/openmpi-1.9_64_cc 
> --libdir=/usr/local/openmpi-1.9_64_cc/lib64 
> --with-jdk-bindir=/usr/local/jdk1.8.0/bin 
> --with-jdk-headers=/usr/local/jdk1.8.0/include JAVA_HOME=/usr/local/jdk1.8.0 
> LDFLAGS=-m64 CC=cc CXX=CC FC=f95 CFLAGS=-m64 CXXFLAGS=-m64 -library=stlport4 
> FCFLAGS=-m64 CPP=cpp CXXCPP=cpp CPPFLAGS= CXXCPPFLAGS= --enable-mpi-cxx 
> --enable-cxx-exceptions --enable-mpi-java --enable-heterogeneous 
> --enable-mpi-thread-multiple --with-threads=posix --with-hwloc=internal 
> --without-verbs --with-wrapper-cflags=-m64 --enable-debug
> 
> 
> What can I do to provide line numbers for the "mca_oob_tcp_accept:
> accept() failed" error?
> 
> Kind regards
> 
> Siegmar
> 
> 
>> On Sep 2, 2014, at 6:01 AM, Siegmar Gross 
> <siegmar.gr...@informatik.hs-fulda.de> wrote:
>> 
>>> C problem:
>>> ==========
>>> 
>>> tyr small_prog 111 mpiexec -np 1 --host linpc0 init_finalize
>>> [tyr.informatik.hs-fulda.de:00593] mca_oob_tcp_accept: accept() failed: 
> Error 0 (11).
>>> Hello!
>>> 
>>> tyr small_prog 112 mpiexec -np 1 --host sunpc0 init_finalize
>>> [tyr.informatik.hs-fulda.de:00597] mca_oob_tcp_accept: accept() failed: 
> Error 0 (11).
>>> Hello!
>>> 
>>> tyr small_prog 113 mpiexec -np 1 --host tyr init_finalize
>>> [tyr:00606] *** Process received signal ***
>>> [tyr:00606] Signal: Bus Error (10)
>>> [tyr:00606] Signal code: Invalid address alignment (1)
>>> [tyr:00606] Failing at address: ffffffff7fffd7fc
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_back
> trace_print+0x1c
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:0x1a4960
>>> /lib/sparcv9/libc.so.1:0xd8b98
>>> /lib/sparcv9/libc.so.1:0xcc70c
>>> /lib/sparcv9/libc.so.1:0xcc918
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack_int64+0xf4 [ Signal 
>>> 2096416616 (?)]
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack_buffer+0x168
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack+0x24c
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/openmpi/mca_pmix_native.so:0x1
> 4e10
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:ompi_mpi_init+
> 0xd18
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:MPI_Init+0x26c
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x12c
>>> [tyr:00606] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> mpiexec noticed that process rank 0 with PID 606 on node tyr exited on 
> signal 10 (Bus Error).
>>> --------------------------------------------------------------------------
>>> tyr small_prog 114 
>>> 
>>> 
>>> 
>>> gdb shows the following backtrace.
>>> 
>>> tyr small_prog 115 /usr/local/gdb-7.6.1_64_gcc/bin/gdb 
> /usr/local/openmpi-1.9_64_cc/bin/mpiexec 
>>> GNU gdb (GDB) 7.6.1
>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>> License GPLv3+: GNU GPL version 3 or later 
> <http://gnu.org/licenses/gpl.html>
>>> This is free software: you are free to change and redistribute it.
>>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>>> and "show warranty" for details.
>>> This GDB was configured as "sparc-sun-solaris2.10".
>>> For bug reporting instructions, please see:
>>> <http://www.gnu.org/software/gdb/bugs/>...
>>> Reading symbols from 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/bin/orterun...done.
>>> (gdb) run -np 1 --host tyr init_finalize   
>>> Starting program: /usr/local/openmpi-1.9_64_cc/bin/mpiexec -np 1 --host tyr 
> init_finalize
>>> [Thread debugging using libthread_db enabled]
>>> [New Thread 1 (LWP 1)]
>>> [New LWP    2        ]
>>> [tyr:00628] *** Process received signal ***
>>> [tyr:00628] Signal: Bus Error (10)
>>> [tyr:00628] Signal code: Invalid address alignment (1)
>>> [tyr:00628] Failing at address: ffffffff7fffd73c
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_back
> trace_print+0x1c
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:0x1a4960
>>> /lib/sparcv9/libc.so.1:0xd8b98
>>> /lib/sparcv9/libc.so.1:0xcc70c
>>> /lib/sparcv9/libc.so.1:0xcc918
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack_int64+0xf4 [ Signal 
>>> 2096416616 (?)]
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack_buffer+0x168
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libopen-pal.so.0.0.0:opal_dss_
> unpack+0x24c
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/openmpi/mca_pmix_native.so:0x1
> 4e10
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:ompi_mpi_init+
> 0xd18
>>> 
> /export2/prog/SunOS_sparc/openmpi-1.9_64_cc/lib64/libmpi.so.0.0.0:MPI_Init+0x26c
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
>>> /home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x12c
>>> [tyr:00628] *** End of error message ***
>>> --------------------------------------------------------------------------
>>> mpiexec noticed that process rank 0 with PID 628 on node tyr exited on 
> signal 10 (Bus Error).
>>> --------------------------------------------------------------------------
>>> [
>> 
> 
