I'm on the road the rest of this week, but can look at this when I return
next week. It looks like something unrelated to the Java bindings failed to
initialize properly - at a guess, I'd suspect that you are missing the
LD_LIBRARY_PATH setting, so none of the OMPI libs were found.
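
In the meantime, something along these lines might be worth a try - untested,
and the paths are simply taken from your configure line further down, so
adjust them to match your installation:

  # make the OMPI libs visible to the shell that launches mpiexec
  export LD_LIBRARY_PATH=/usr/local/openmpi-1.9_64_cc/lib64:$LD_LIBRARY_PATH
  # quick sanity check that libmpi resolves all of its dependencies
  ldd /usr/local/openmpi-1.9_64_cc/lib64/libmpi.so | grep "not found"
  # forward the setting to the launched processes as well
  mpiexec -x LD_LIBRARY_PATH java -cp $HOME/mpi_classfiles HelloMainWithBarrier

The -x option exports the named environment variable to the processes that
mpiexec launches, which should rule out the case where your interactive shell
has the path set but the launched JVM does not.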

On Wed, Sep 26, 2012 at 5:42 AM, Siegmar Gross <
siegmar.gr...@informatik.hs-fulda.de> wrote:

> Hi,
>
> Yesterday I installed openmpi-1.9a1r27362 on Solaris and Linux, and I have
> a problem with mpiJava on Linux (openSUSE Linux 12.1, x86_64).
>
>
> linpc4 mpi_classfiles 104 javac HelloMainWithoutMPI.java
> linpc4 mpi_classfiles 105 mpijavac HelloMainWithBarrier.java
> linpc4 mpi_classfiles 106 mpijavac -showme
> /usr/local/jdk1.7.0_07-64/bin/javac \
>   -cp ...:.:/usr/local/openmpi-1.9_64_cc/lib64/mpi.jar
>
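
One quick thing that costs nothing to double-check: mpijavac -showme above
only shows the compile-time classpath. mpiexec may already add mpi.jar
automatically when it sees the java command, but listing it explicitly at run
time (the path is the one from your -showme output) rules that part out:

  mpiexec java \
    -cp $HOME/mpi_classfiles:/usr/local/openmpi-1.9_64_cc/lib64/mpi.jar \
    HelloMainWithBarrier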
>
> It works with Java without MPI.
>
> linpc4 mpi_classfiles 107 mpiexec java -cp $HOME/mpi_classfiles \
>   HelloMainWithoutMPI
> Hello from linpc4.informatik.hs-fulda.de/193.174.26.225
>
>
> It breaks with Java and MPI.
>
> linpc4 mpi_classfiles 108 mpiexec java -cp $HOME/mpi_classfiles \
>   HelloMainWithBarrier
> --------------------------------------------------------------------------
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   mca_base_open failed
>   --> Returned value -2 instead of OPAL_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   opal_init failed
>   --> Returned value Out of resource (-2) instead of ORTE_SUCCESS
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> It looks like MPI_INIT failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during MPI_INIT; some of which are due to configuration or environment
> problems.  This failure appears to be an internal failure; here's some
> additional information (which may only be relevant to an Open MPI
> developer):
>
>   ompi_mpi_init: orte_init failed
>   --> Returned "Out of resource" (-2) instead of "Success" (0)
> --------------------------------------------------------------------------
> *** An error occurred in MPI_Init
> *** on a NULL communicator
> *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
> ***    and potentially your MPI job)
> [linpc4:15332] Local abort before MPI_INIT completed successfully; not able
> to aggregate error messages, and not able to guarantee that all other
> processes were killed!
> -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status,
> thus causing the job to be terminated. The first process to do so was:
>
>   Process name: [[58875,1],0]
>   Exit code:    1
> --------------------------------------------------------------------------
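
The mca_base_open / opal_init failure above happens in the native layer before
any Java-specific code runs, so it may help to take Java out of the picture on
linpc4 for a moment - a rough sketch, using the C example that ships in the
source tree (adjust the relative path to wherever your source/build tree is):

  # does a trivial non-MPI command launch cleanly?
  mpiexec -np 1 hostname
  # does the native layer initialize outside the JVM?
  mpicc -o hello_c ../openmpi-1.9a1r27362/examples/hello_c.c
  mpiexec -np 1 ./hello_c

If hello_c reports the same opal_init error, the problem is in the
installation or environment on that machine rather than in the Java bindings
themselves.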
>
>
> I configured with the following command.
>
> ../openmpi-1.9a1r27362/configure --prefix=/usr/local/openmpi-1.9_64_cc \
>   --libdir=/usr/local/openmpi-1.9_64_cc/lib64 \
>   --with-jdk-bindir=/usr/local/jdk1.7.0_07-64/bin \
>   --with-jdk-headers=/usr/local/jdk1.7.0_07-64/include \
>   JAVA_HOME=/usr/local/jdk1.7.0_07-64 \
>   LDFLAGS="-m64" \
>   CC="cc" CXX="CC" FC="f95" \
>   CFLAGS="-m64" CXXFLAGS="-m64 -library=stlport4" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   CPPFLAGS="" CXXCPPFLAGS="" \
>   C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
>   OBJC_INCLUDE_PATH="" OPENMPI_HOME="" \
>   --enable-cxx-exceptions \
>   --enable-mpi-java \
>   --enable-heterogeneous \
>   --enable-opal-multi-threads \
>   --enable-mpi-thread-multiple \
>   --with-threads=posix \
>   --with-hwloc=internal \
>   --without-verbs \
>   --without-udapl \
>   --with-wrapper-cflags=-m64 \
>   --enable-debug \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_cc
>
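
It might also be worth confirming that this build really picked up the Java
support - a quick check; the exact wording of ompi_info's output may differ
on the trunk:

  /usr/local/openmpi-1.9_64_cc/bin/ompi_info | grep -i java

which should mention the Java bindings if --enable-mpi-java took effect.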
>
> It works fine on the Solaris machines as long as all hosts have the same
> architecture (SPARC or x86_64).
>
> tyr mpi_classfiles 194 mpiexec -host sunpc0,sunpc1,sunpc4 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> Process 1 of 3 running on sunpc1
> Process 2 of 3 running on sunpc4.informatik.hs-fulda.de
> Process 0 of 3 running on sunpc0
>
> sunpc4 fd1026 107 mpiexec -host tyr,rs0,rs1 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> Process 1 of 3 running on rs0.informatik.hs-fulda.de
> Process 2 of 3 running on rs1.informatik.hs-fulda.de
> Process 0 of 3 running on tyr.informatik.hs-fulda.de
>
>
> It breaks if the hosts are a mix of both architectures.
>
> sunpc4 fd1026 106 mpiexec -host tyr,rs0,sunpc1 \
>   java -cp $HOME/mpi_classfiles HelloMainWithBarrier
> [rs0.informatik.hs-fulda.de:7718] *** An error occurred in MPI_Comm_dup
> [rs0.informatik.hs-fulda.de:7718] *** reported by process [565116929,1]
> [rs0.informatik.hs-fulda.de:7718] *** on communicator MPI_COMM_WORLD
> [rs0.informatik.hs-fulda.de:7718] *** MPI_ERR_INTERN: internal error
> [rs0.informatik.hs-fulda.de:7718] *** MPI_ERRORS_ARE_FATAL (processes
>   in this communicator will now abort,
> [rs0.informatik.hs-fulda.de:7718] ***    and potentially your MPI job)
> [sunpc4.informatik.hs-fulda.de:07900] 1 more process has sent help
>   message help-mpi-errors.txt / mpi_errors_are_fatal
> [sunpc4.informatik.hs-fulda.de:07900] Set MCA parameter
>   "orte_base_help_aggregate" to 0 to see all help / error messages
>
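
For the heterogeneous case, the complete error text from every rank would
help; the last help message above already names the knob for that:

  mpiexec --mca orte_base_help_aggregate 0 -host tyr,rs0,sunpc1 \
    java -cp $HOME/mpi_classfiles HelloMainWithBarrier

If you can capture that output (in particular from rs0), please send it along
and I will take a look when I am back next week.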
>
> Please let me know if I can provide anything else to help track down these
> errors. Thank you very much in advance for any help.
>
>
> Kind regards
>
> Siegmar
>
