Hi,

Yesterday I installed openmpi-1.9a1r32657 on my machines (Solaris
10 Sparc (tyr), Solaris 10 x86_64 (sunpc0), and openSUSE Linux 12.1
x86_64 (linpc0)) with Sun C 5.12 and gcc-4.9.0.

I have the following problems with my gcc build. First, once more,
my problems with Java, and below them my problems with C. As far as
I can tell, these are the same problems that I have with the Sun C build.



Java problem:
=============

tyr java 106 mpijavac InitFinalizeMain.java 
warning: [path] bad path element "/usr/local/openmpi-1.9_64_gcc/lib64/shmem.jar": no such file or directory
1 warning
tyr java 107 mpiexec -np 1 java InitFinalizeMain
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=774, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# C  [libc.so.1+0x3c7f0]  strlen+0x50
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid774.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 774 on node tyr exited on signal 6 (Abort).
--------------------------------------------------------------------------
tyr java 108 /usr/local/gdb-7.6.1_64_gcc/bin/gdb /usr/local/openmpi-1.9_64_gcc/bin/mpiexec
GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/bin/orterun...done.
(gdb) run -np 1 java InitFinalizeMain 
Starting program: /usr/local/openmpi-1.9_64_gcc/bin/mpiexec -np 1 java InitFinalizeMain
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=791, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# C  [libc.so.1+0x3c7f0]  strlen+0x50
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid791.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 791 on node tyr exited on signal 6 (Abort).
--------------------------------------------------------------------------
[LWP    2         exited]
[New Thread 2        ]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7ec88e98 in vm_close () from /usr/local/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0
#9  0xffffffff7ec86478 in lt_dlclose () from /usr/local/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0
#10 0xffffffff7ecab5fc in ri_destructor (obj=0x1001fe750)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_component_repository.c:382
#11 0xffffffff7eca9f38 in opal_obj_run_destructors (object=0x1001fe750)
    at ../../../../openmpi-1.9a1r32657/opal/class/opal_object.h:446
#12 0xffffffff7ecaae9c in mca_base_component_repository_release (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_component_repository.c:240
#13 0xffffffff7ecad19c in mca_base_component_unload (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7ecad230 in mca_base_component_close (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7ecad304 in mca_base_components_close (output_id=-1, 
    components=0xffffffff7f146d88 <orte_oob_base_framework+80>, skip=0x0)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7ecad26c in mca_base_framework_components_close (
    framework=0xffffffff7f146d38 <orte_oob_base_framework>, skip=0x0)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:66
#17 0xffffffff7efc671c in orte_oob_base_close ()
    at ../../../../openmpi-1.9a1r32657/orte/mca/oob/base/oob_base_frame.c:98
#18 0xffffffff7ecc1b28 in mca_base_framework_close (
    framework=0xffffffff7f146d38 <orte_oob_base_framework>)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be07858 in rte_finalize ()
    at ../../../../../openmpi-1.9a1r32657/orte/mca/ess/hnp/ess_hnp_module.c:857
#20 0xffffffff7ef337fc in orte_finalize ()
    at ../../openmpi-1.9a1r32657/orte/runtime/orte_finalize.c:66
#21 0x00000001000071e0 in orterun (argc=5, argv=0xffffffff7fffe108)
    at ../../../../openmpi-1.9a1r32657/orte/tools/orterun/orterun.c:1099
#22 0x0000000100003e60 in main (argc=5, argv=0xffffffff7fffe108)
    at ../../../../openmpi-1.9a1r32657/orte/tools/orterun/main.c:13
(gdb) 






C problem:
==========

tyr small_prog 115 mpiexec -np 1 --host linpc0 init_finalize
[tyr.informatik.hs-fulda.de:00815] mca_oob_tcp_accept: accept() failed: Error 0 (11).
Hello!
tyr small_prog 116 mpiexec -np 1 --host sunpc0 init_finalize
[tyr.informatik.hs-fulda.de:00819] mca_oob_tcp_accept: accept() failed: Error 0 (11).
Hello!
tyr small_prog 117 mpiexec -np 1 --host tyr init_finalize
select: Interrupted system call
[tyr:00825] *** Process received signal ***
[tyr:00825] Signal: Bus Error (10)
[tyr:00825] Signal code: Invalid address alignment (1)
[tyr:00825] Failing at address: ffffffff7bd1bfec
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:0xdd1d8
/lib/sparcv9/libc.so.1:0xd8b98
/lib/sparcv9/libc.so.1:0xcc70c
/lib/sparcv9/libc.so.1:0xcc918
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:opal_proc_set_name+0x1c
 [ Signal 10 (BUS)]
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/openmpi/mca_pmix_native.so:0x103d0
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/openmpi/mca_ess_pmi.so:0x2fec
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-rte.so.0.0.0:orte_init+0x624
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0x3a8
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libmpi.so.0.0.0:PMPI_Init+0x2a8
/home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
/home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x7c
[tyr:00825] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 825 on node tyr exited on signal 10 (Bus Error).
--------------------------------------------------------------------------
tyr small_prog 118 /usr/local/gdb-7.6.1_64_gcc/bin/gdb /usr/local/openmpi-1.9_64_gcc/bin/mpiexec
GNU gdb (GDB) 7.6.1
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.10".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/bin/orterun...done.
(gdb) run -np 1 --host tyr init_finalize   
Starting program: /usr/local/openmpi-1.9_64_gcc/bin/mpiexec -np 1 --host tyr init_finalize
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
select: Interrupted system call
[tyr:00842] *** Process received signal ***
[tyr:00842] Signal: Bus Error (10)
[tyr:00842] Signal code: Invalid address alignment (1)
[tyr:00842] Failing at address: ffffffff7bd1bfec
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:opal_backtrace_print+0x2c
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:0xdd1d8
/lib/sparcv9/libc.so.1:0xd8b98
/lib/sparcv9/libc.so.1:0xcc70c
/lib/sparcv9/libc.so.1:0xcc918
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0.0.0:opal_proc_set_name+0x1c
 [ Signal 10 (BUS)]
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/openmpi/mca_pmix_native.so:0x103d0
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/openmpi/mca_ess_pmi.so:0x2fec
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libopen-rte.so.0.0.0:orte_init+0x624
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libmpi.so.0.0.0:ompi_mpi_init+0x3a8
/export2/prog/SunOS_sparc/openmpi-1.9_64_gcc/lib64/libmpi.so.0.0.0:PMPI_Init+0x2a8
/home/fd1026/SunOS/sparc/bin/init_finalize:main+0x10
/home/fd1026/SunOS/sparc/bin/init_finalize:_start+0x7c
[tyr:00842] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 842 on node tyr exited on signal 10 (Bus Error).
--------------------------------------------------------------------------
[LWP    2         exited]
[New Thread 2        ]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7ec88e98 in vm_close () from /usr/local/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0
#9  0xffffffff7ec86478 in lt_dlclose () from /usr/local/openmpi-1.9_64_gcc/lib64/libopen-pal.so.0
#10 0xffffffff7ecab5fc in ri_destructor (obj=0x1001fe750)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_component_repository.c:382
#11 0xffffffff7eca9f38 in opal_obj_run_destructors (object=0x1001fe750)
    at ../../../../openmpi-1.9a1r32657/opal/class/opal_object.h:446
#12 0xffffffff7ecaae9c in mca_base_component_repository_release (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_component_repository.c:240
#13 0xffffffff7ecad19c in mca_base_component_unload (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7ecad230 in mca_base_component_close (
    component=0xffffffff7b122fa8 <mca_oob_tcp_component>, output_id=-1)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7ecad304 in mca_base_components_close (output_id=-1, 
    components=0xffffffff7f146d88 <orte_oob_base_framework+80>, skip=0x0)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7ecad26c in mca_base_framework_components_close (
    framework=0xffffffff7f146d38 <orte_oob_base_framework>, skip=0x0)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_components_close.c:66
#17 0xffffffff7efc671c in orte_oob_base_close ()
    at ../../../../openmpi-1.9a1r32657/orte/mca/oob/base/oob_base_frame.c:98
#18 0xffffffff7ecc1b28 in mca_base_framework_close (
    framework=0xffffffff7f146d38 <orte_oob_base_framework>)
    at ../../../../openmpi-1.9a1r32657/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be07858 in rte_finalize ()
    at ../../../../../openmpi-1.9a1r32657/orte/mca/ess/hnp/ess_hnp_module.c:857
#20 0xffffffff7ef337fc in orte_finalize ()
    at ../../openmpi-1.9a1r32657/orte/runtime/orte_finalize.c:66
#21 0x00000001000071e0 in orterun (argc=6, argv=0xffffffff7fffe0f8)
    at ../../../../openmpi-1.9a1r32657/orte/tools/orterun/orterun.c:1099
#22 0x0000000100003e60 in main (argc=6, argv=0xffffffff7fffe0f8)
    at ../../../../openmpi-1.9a1r32657/orte/tools/orterun/main.c:13
(gdb) 
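
One observation on the bus error, offered as a guess rather than a
diagnosis: the failing address ffffffff7bd1bfec is a multiple of 4 but
not of 8. SPARC traps on misaligned loads and stores, so an 8-byte access
at that address would raise exactly this "Invalid address alignment"
signal. I have not verified where the access actually happens. The helpers
below are hypothetical names of my own, not Open MPI functions. They only
sketch how to check the alignment of an address and how to copy a 64-bit
value through memcpy, so that no aligned 8-byte access is required.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if addr is a multiple of align.
 * align must be a power of two (4 for 32-bit, 8 for 64-bit accesses). */
static int is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Hypothetical helper: read a 64-bit value from an arbitrary address.
 * memcpy copies byte by byte as far as the language is concerned, so the
 * source does not have to be 8-byte aligned. */
static uint64_t load_u64(const void *src)
{
    uint64_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Hypothetical helper: write a 64-bit value to an arbitrary address,
 * again through memcpy instead of a direct 8-byte store. */
static void store_u64(void *dst, uint64_t value)
{
    memcpy(dst, &value, sizeof value);
}
```

With these helpers, is_aligned(0xffffffff7bd1bfecULL, 4) is true and
is_aligned(0xffffffff7bd1bfecULL, 8) is false, which is consistent with
the signal code reported above.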



I would be grateful if somebody could fix these problems. Thank you
very much in advance for any help.


Kind regards

Siegmar
