I'm on the road for the rest of this week, but I can look at this when I return next week. It looks like something unrelated to the Java bindings failed to initialize properly; at a guess, you are missing the LD_LIBRARY_PATH setting, so none of the OMPI libraries were found.
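In case it helps while I'm away, here is a minimal sketch of what I mean, assuming a bash-like shell and the install prefix from your configure line (adjust the paths if your layout differs):

  # make the Open MPI binaries and libraries visible on the launch node
  export PATH=/usr/local/openmpi-1.9_64_cc/bin:$PATH
  export LD_LIBRARY_PATH=/usr/local/openmpi-1.9_64_cc/lib64:$LD_LIBRARY_PATH

  # forward LD_LIBRARY_PATH to the launched processes with -x
  mpiexec -x LD_LIBRARY_PATH java -cp $HOME/mpi_classfiles HelloMainWithBarrier

If the remote nodes use a different install prefix, LD_LIBRARY_PATH has to point at the correct lib64 directory on each of them as well.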
On Wed, Sep 26, 2012 at 5:42 AM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> yesterday I installed openmpi-1.9a1r27362 on Solaris and Linux and
> I have a problem with mpiJava on Linux (openSUSE-Linux 12.1, x86_64).
>
> linpc4 mpi_classfiles 104 javac HelloMainWithoutMPI.java
> linpc4 mpi_classfiles 105 mpijavac HelloMainWithBarrier.java
> linpc4 mpi_classfiles 106 mpijavac -showme
> /usr/local/jdk1.7.0_07-64/bin/javac \
>   -cp ...:.:/usr/local/openmpi-1.9_64_cc/lib64/mpi.jar
>
> It works with Java without MPI.
>
> linpc4 mpi_classfiles 107 mpiexec java -cp $HOME/mpi_classfiles \
>   HelloMainWithoutMPI
> Hello from linpc4.informatik.hs-fulda.de/193.174.26.225
>
> It breaks with Java and MPI.
>
> linpc4 mpi_classfiles 108 mpiexec java -cp $HOME/mpi_classfiles \
>   HelloMainWithBarrier
> --------------------------------------------------------------------------
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   mca_base_open failed
>   --> Returned value -2 instead of OPAL_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   opal_init failed
>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems. This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   ompi_mpi_init: orte_init failed
>   --> Returned "Out of resource" (-2) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> *** and potentially your MPI job)
> [linpc4:15332] Local abort before MPI_INIT completed successfully; not able to
> aggregate error messages, and not able to guarantee that all other processes
> were killed!
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[58875,1],0]
>   Exit code: 1
> --------------------------------------------------------------------------
>
> I configured with the following command.
>
> ../openmpi-1.9a1r27362/configure --prefix=/usr/local/openmpi-1.9_64_cc \
>   --libdir=/usr/local/openmpi-1.9_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.7.0_07-64/bin \
>   --with-jdk-headers=/usr/local/jdk1.7.0_07-64/include \
>   JAVA_HOME=/usr/local/jdk1.7.0_07-64 \
>   LDFLAGS="-m64" \
>   CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   CPPFLAGS="" CXXCPPFLAGS="" \
>   C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
>   OBJC_INCLUDE_PATH="" OPENMPI_HOME="" \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-opal-multi-threads \
>   --enable-mpi-thread-multiple \
>   --with-threads=posix \
>   --with-hwloc=internal \
>   --without-verbs \
>   --without-udapl \
>   --with-wrapper-cflags=-m64 \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
> It works fine on Solaris machines as long as the hosts belong to the
> same kind (Sparc or x86_64).
>
> tyr mpi_classfiles 194 mpiexec -host sunpc0,sunpc1,sunpc4 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> Process 1 of 3 running on sunpc1
> Process 2 of 3 running on sunpc4.informatik.hs-fulda.de
> Process 0 of 3 running on sunpc0
>
> sunpc4 fd1026 107 mpiexec -host tyr,rs0,rs1 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> Process 1 of 3 running on rs0.informatik.hs-fulda.de
> Process 2 of 3 running on rs1.informatik.hs-fulda.de
> Process 0 of 3 running on tyr.informatik.hs-fulda.de
>
> It breaks if the hosts belong to both kinds of machines.
>
> sunpc4 fd1026 106 mpiexec -host tyr,rs0,sunpc1 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> [rs0.informatik.hs-fulda.de:7718] *** An error occurred in MPI_Comm_dup
> [rs0.informatik.hs-fulda.de:7718] *** reported by process [565116929,1]
> [rs0.informatik.hs-fulda.de:7718] *** on communicator MPI_COMM_WORLD
> [rs0.informatik.hs-fulda.de:7718] *** MPI_ERR_INTERN: internal error
> [rs0.informatik.hs-fulda.de:7718] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> [rs0.informatik.hs-fulda.de:7718] *** and potentially your MPI job)
> [sunpc4.informatik.hs-fulda.de:07900] 1 more process has sent help message
> help-mpi-errors.txt / mpi_errors_are_fatal
> [sunpc4.informatik.hs-fulda.de:07900] Set MCA parameter
> "orte_base_help_aggregate" to 0 to see all help / error messages
>
> Please let me know if I can provide anything else to track these errors.
> Thank you very much for any help in advance.
>
> Kind regards
>
> Siegmar
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users