Hi,

today I installed openmpi-1.8.2rc2r32288 on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with Sun C 5.12 and gcc-4.9.0. Unfortunately, I have problems with both compilers on "Solaris 10 Sparc". My small program works as expected on "Solaris 10 x86_64" and Linux.

Problem with Sun C 5.12:
------------------------

tyr hello_1 128 which mpicc
/usr/local/openmpi-1.8.2_64_cc/bin/mpicc
tyr hello_1 129 ompi_info | grep MPI:
Open MPI: 1.8.2rc2r32288
tyr hello_1 130 mpicc hello_1_mpi.c
tyr hello_1 131 mpiexec -np 2 a.out
Process 0 of 2 running on tyr.informatik.hs-fulda.de

Now 1 slave tasks are sending greetings.

Process 1 of 2 running on tyr.informatik.hs-fulda.de
ld.so.1: a.out: fatal: relocation error: file /usr/local/openmpi-1.8.2_64_cc/lib64/openmpi/: symbol alloca: referenced symbol not found
ld.so.1: a.out: fatal: relocation error: file /usr/local/openmpi-1.8.2_64_cc/lib64/openmpi/: symbol alloca: referenced symbol not found
----------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 28377 on node tyr exited on signal 9 (Killed).
----------------------------------------------------------------------
tyr hello_1 132


I also have a problem with the Java interface on Solaris Sparc and x86_64, with essentially the same error message on both.

tyr java 150 mpijavac InitFinalizeMain.java
tyr java 151 mpiexec -np 1 java InitFinalizeMain
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0xffffffff7ea3c7f0, pid=28585, tid=2
#
# JRE version: Java(TM) SE Runtime Environment (8.0-b132) (build 1.8.0-b132)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.0-b70 mixed mode solaris-sparc compressed oops)
# Problematic frame:
# C [libc.so.1+0x3c7f0] strlen+0x50
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /home/fd1026/work/skripte/master/parallel/prog/mpi/java/hs_err_pid28585.log
#
# If you would like to submit a bug report, please visit:
# http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 28585 on node tyr exited on signal 6 (Abort).
--------------------------------------------------------------------------
tyr java 152


It works on Linux, but displays a warning.

tyr java 153 ssh linpc1
linpc1 fd1026 101 cd /home/fd1026/work/skripte/master/parallel/prog/mpi/java
linpc1 java 102 mpijavac InitFinalizeMain.java
linpc1 java 103 mpiexec -np 1 java InitFinalizeMain
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /usr/local/openmpi-1.8.2_64_cc/lib64/libmpi_java.so.1.2.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Hello!
linpc1 java 104


Problem with gcc-4.9.0:
-----------------------

tyr hello_1 104 which mpicc
/usr/local/openmpi-1.8.2_64_gcc/bin/mpicc
tyr hello_1 105 ompi_info | grep MPI:
Open MPI: 1.8.2rc2r32288
tyr hello_1 106 mpicc hello_1_mpi.c
tyr hello_1 107 mpiexec -np 2 a.out
[tyr:28540] *** Process received signal ***
[tyr:28540] Signal: Bus Error (10)
[tyr:28540] Signal code: Invalid address alignment (1)
[tyr:28540] Failing at address: ffffffff7fffd1c4
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_backtrace_print+0x2c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:0xccfd0
/lib/sparcv9/libc.so.1:0xd8b98
/lib/sparcv9/libc.so.1:0xcc70c
/lib/sparcv9/libc.so.1:0xcc918
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_db_hash.so:0x3ee8 [ Signal 10 (BUS)]
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_db_base_store+0xc8
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_decode_pidmap+0x798
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_nidmap_init+0x3cc
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_ess_env.so:0x226c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_init+0x308
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:ompi_mpi_init+0x31c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:PMPI_Init+0x2a8
/home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:main+0x20
/home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:_start+0x7c
[tyr:28540] *** End of error message ***
[tyr:28542] *** Process received signal ***
[tyr:28542] Signal: Bus Error (10)
[tyr:28542] Signal code: Invalid address alignment (1)
[tyr:28542] Failing at address: ffffffff7fffd1c4
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_backtrace_print+0x2c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:0xccfd0
/lib/sparcv9/libc.so.1:0xd8b98
/lib/sparcv9/libc.so.1:0xcc70c
/lib/sparcv9/libc.so.1:0xcc918
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_db_hash.so:0x3ee8 [ Signal 10 (BUS)]
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-pal.so.6.2.0:opal_db_base_store+0xc8
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_decode_pidmap+0x8f8
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_util_nidmap_init+0x3cc
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/openmpi/mca_ess_env.so:0x226c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libopen-rte.so.7.0.4:orte_init+0x308
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:ompi_mpi_init+0x31c
/export2/prog/SunOS_sparc/openmpi-1.8.2_64_gcc/lib64/libmpi.so.1.5.2:PMPI_Init+0x2a8
/home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:main+0x20
/home/fd1026/work/skripte/master/parallel/prog/mpi/hello_1/a.out:_start+0x7c
[tyr:28542] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 28542 on node tyr exited on signal 10 (Bus Error).
--------------------------------------------------------------------------
tyr hello_1 108


I would be grateful if somebody could solve these problems. Please let me know if I can provide any other information.


Kind regards

Siegmar