Hi,

Yesterday I installed openmpi-1.8.2rc4r32485 on my machines
(Solaris 10 Sparc (tyr), Solaris 10 x86_64 (sunpc1), and
openSUSE Linux 12.1 x86_64 (linpc1)) with Sun C 5.12. A small
Java program crashes with SIGSEGV on both of my Solaris systems.
On Linux it runs, apart from a stack guard warning.

tyr java 118 ssh linpc1
linpc1 fd1026 101  mpiexec -np 1 java InitFinalizeMain
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/openmpi-1.8.2_64_cc/lib64/libmpi_java.so.1.2.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Hello!
linpc1 fd1026 102 exit
logout
tyr java 119 ssh sunpc1
sunpc1 fd1026 104  mpiexec -np 1 java InitFinalizeMain
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xfffffd7fff1d77f0, pid=24042, tid=2
...

tyr java 121 mpiexec -np 1 java InitFinalizeMain
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=21379, tid=2
...


gdb shows the following backtrace.

tyr java 124 /usr/local/gdb-7.6.1_64_gcc/bin/gdb /usr/local/openmpi-1.8.2_64_cc/bin/mpiexec
GNU gdb (GDB) 7.6.1
...
(gdb) run -np 1 java InitFinalizeMain
Starting program: /usr/local/openmpi-1.8.2_64_cc/bin/mpiexec -np 1 java InitFinalizeMain
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
[New LWP    2        ]
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=21399, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# C  [libc.so.1+0x3c7f0]  strlen+0x50
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid21399.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 21399 on node tyr exited on signal 6 (Abort).
--------------------------------------------------------------------------
[LWP    2         exited]
[New Thread 2        ]
[Switching to Thread 1 (LWP 1)]
sol_thread_fetch_registers: td_ta_map_id2thr: no thread can be found to satisfy query
(gdb) bt
#0  0xffffffff7f6173d0 in rtld_db_dlactivity () from /usr/lib/sparcv9/ld.so.1
#1  0xffffffff7f6175a8 in rd_event () from /usr/lib/sparcv9/ld.so.1
#2  0xffffffff7f618950 in lm_delete () from /usr/lib/sparcv9/ld.so.1
#3  0xffffffff7f6226bc in remove_so () from /usr/lib/sparcv9/ld.so.1
#4  0xffffffff7f624574 in remove_hdl () from /usr/lib/sparcv9/ld.so.1
#5  0xffffffff7f61d97c in dlclose_core () from /usr/lib/sparcv9/ld.so.1
#6  0xffffffff7f61d9d4 in dlclose_intn () from /usr/lib/sparcv9/ld.so.1
#7  0xffffffff7f61db0c in dlclose () from /usr/lib/sparcv9/ld.so.1
#8  0xffffffff7e8cb348 in vm_close () from /usr/local/openmpi-1.8.2_64_cc/lib64/libopen-pal.so.6
#9  0xffffffff7e8c8634 in lt_dlclose () from /usr/local/openmpi-1.8.2_64_cc/lib64/libopen-pal.so.6
#10 0xffffffff7e91edcc in ri_destructor (obj=0xff)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_component_repository.c:391
#11 0xffffffff7e91c5a0 in opal_obj_run_destructors (object=0xffffff7c701d00ff)
    at ../../../../openmpi-1.8.2rc4r32485/opal/class/opal_object.h:446
#12 0xffffffff7e91e61c in mca_base_component_repository_release (component=0x10ff)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_component_repository.c:244
#13 0xffffffff7e924c78 in mca_base_component_unload (component=0xffffff7f73c63800, output_id=67583)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_components_close.c:47
#14 0xffffffff7e924d1c in mca_base_component_close (component=0xffffff0000000100, output_id=268480767)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_components_close.c:60
#15 0xffffffff7e924e2c in mca_base_components_close (output_id=1947894015, components=0xffffff7f501368ff, skip=0x2ff)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_components_close.c:86
#16 0xffffffff7e924d6c in mca_base_framework_components_close (framework=0xffffff7d7455d4ff, skip=0xffffff7f200a90ff)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_components_close.c:68
#17 0xffffffff7ee1d7c8 in orte_oob_base_close ()
    at ../../../../openmpi-1.8.2rc4r32485/orte/mca/oob/base/oob_base_frame.c:94
#18 0xffffffff7e954ac0 in mca_base_framework_close (framework=0xffffff0000004b00)
    at ../../../../openmpi-1.8.2rc4r32485/opal/mca/base/mca_base_framework.c:187
#19 0xffffffff7be139fc in rte_finalize ()
    at ../../../../../openmpi-1.8.2rc4r32485/orte/mca/ess/hnp/ess_hnp_module.c:858
#20 0xffffffff7ec38274 in orte_finalize ()
    at ../../openmpi-1.8.2rc4r32485/orte/runtime/orte_finalize.c:65
#21 0x000000010000ddf0 in orterun (argc=3327, argv=0x0)
    at ../../../../openmpi-1.8.2rc4r32485/orte/tools/orterun/orterun.c:1096
#22 0x0000000100004614 in main (argc=255, argv=0xffffff7f078ce800)
    at ../../../../openmpi-1.8.2rc4r32485/orte/tools/orterun/main.c:13
(gdb) 


It seems that I now have the same problem with Sun C and Java that I
reported earlier for gcc and C. The C version of my small program
works fine with Sun C.

tyr small_prog 129 mpiexec -np 1 init_finalize
Hello!
tyr small_prog 130 


I would be grateful if somebody could fix the problem. Thank you very
much in advance for any help.

Kind regards

Siegmar
