Hello Ralph,

Thanks for your reply.

To start my job, I tried the following two ways:
(1) Configured/compiled Open MPI and compiled the benchmark on the head node,
      then submitted a PBS job.
(2) Submitted an interactive job to redo the configure/compile on a compute node,
      then used "/path/to/mpicc -o hello hello_world.c" to compile the benchmark
      and "/path/to/mpirun -np 2 /path/to/hello" to run the job.
I also tried running "/path/to/mpirun -np 2 hostname", but got the same error.

The configure line is pretty long.

$SRCDIR/configure \
    --prefix=$PREFIX \
    --enable-static --disable-shared --disable-dlopen \
    --disable-pretty-print-stacktrace --disable-pty-support \
    --disable-io-romio --enable-contrib-no-build=libnbc,vt --enable-debug \
    --with-memory-manager=none --with-threads \
    --without-tm \
    --with-wrapper-ldflags="${ADD_WRAPPER_LDFLAGS}" \
    --with-wrapper-libs="-lnsl -lpthread -lm" \
    --with-platform=optimized \
    --with-ugni=/opt/cray/ugni/2.3-1.0400.3912.4.29.gem \
    --with-ugni-libdir=/opt/cray/ugni/2.3-1.0400.3912.4.29.gem/lib64 \
    --with-ugni-includedir=/opt/cray/gni-headers/2.1-1.0400.3906.5.1.gem/include \
    --with-xpmem=/opt/cray/xpmem/0.1-2.0400.29883.4.6.gem \
    --with-xpmem-libdir=/opt/cray/xpmem/0.1-2.0400.29883.4.6.gem/lib64 \
    --enable-mem-debug --enable-mem-profile --enable-debug-symbols \
    --enable-binaries \
    --enable-picky --enable-mpi-f77 --enable-mpi-f90 --enable-mpi-cxx \
    --enable-mpi-cxx-seek \
    --without-slurm --with-memory-manager=ptmalloc2 \
    --with-pmi=/opt/cray/pmi/2.1.4-1.0000.8596.8.9.gem --with-cray-pmi-ext \
    --enable-mca-no-build=maffinity-first_use,maffinity-libnuma,ess-cnos,filem-rsh,grpcomm-cnos,pml-dr \
    ${ADD_COMPILER} \
    CPPFLAGS="${ADD_CPPFLAGS} -I${gniheaders}" \
    FFLAGS="${ADD_FFLAGS} -I${gniheaders}" \
    FCFLAGS="${ADD_FCFLAGS} -I/usr/include -I${gniheaders}" \
    CFLAGS="-I/usr/include -I${gniheaders}" \
    LDFLAGS="--static ${ADD_LDFLAGS} ${UGNILIBS} ${XPMEMLIBS}" \
    LIBS="${ADD_LIBS} -lpthread -lrt -lpthread -lm" | tee build.log

Any idea?


Bin WANG



On Mon, Mar 5, 2012 at 7:13 PM, Ralph Castain <rhc.open...@gmail.com> wrote:

> How did you attempt to start your job, and what does your configure line
> look like?
>
> Sent from my iPad
>
> On Mar 5, 2012, at 2:11 PM, bin Wang <bighead...@gmail.com> wrote:
>
> > Hello All,
> >
> > I'm trying to run the latest Open MPI code on Jaguar
> > (cloned from the Open MPI Mercurial mirror of the Subversion repository).
> > The configuration and compilation of Open MPI went fine, and the benchmark
> > also compiled successfully. I tried to launch my program using mpirun
> > within an interactive job, but it failed immediately.
> >
> > The core dump gave me the following information.
> > ====================Error Msg=========================
> > [jaguarpf-login2:15370] [[INVALID],INVALID] ORTE_ERROR_LOG: Unable to
> start a daemon on the local
> > node in file ess_singleton_module.c at line 220
> >
> --------------------------------------------------------------------------
> > It looks like orte_init failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during orte_init; some of which are due to configuration or
> > environment problems.  This failure appears to be an internal failure;
> > here's some additional information (which may only be relevant to an
> > Open MPI developer):
> > ompi_mpi_init: orte_init failed
> > --> Returned value Unable to start a daemon on the local node (-127)
> instead of ORTE_SUCCESS
> >
> >
> --------------------------------------------------------------------------
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> > ompi_mpi_init: orte_init failed
> > --> Returned "Unable to start a daemon on the local node" (-127) instead of "Success" (0)
> >
> --------------------------------------------------------------------------
> > [jaguarpf-login2:15370] *** An error occurred in MPI_Init
> > [jaguarpf-login2:15370] *** reported by process [4294967295,4294967295]
> > [jaguarpf-login2:15370] *** on a NULL communicator
> > [jaguarpf-login2:15370] *** Unknown error
> > [jaguarpf-login2:15370] *** MPI_ERRORS_ARE_FATAL (processes in this
> communicator will now abort,
> > [jaguarpf-login2:15370] *** and potentially your MPI job)
> >
> --------------------------------------------------------------------------
> > An MPI process is aborting at a time when it cannot guarantee that all
> > of its peer processes in the job will be killed properly.  You should
> > double check that everything has shut down cleanly.
> > Reason:     Before MPI_INIT completed
> > Local host: jaguarpf-login2
> > PID:        15370
> >
> --------------------------------------------------------------------------
> > Program exited with code 01.
> > ====================Error Msg Over=====================
> >
> > There are several components under ess, but I don't know why and how the
> > singleton component was chosen.
> >
> > I hope someone can help me compile and run Open MPI successfully on Jaguar.
> >
> > Any comments and suggestions will be appreciated.
> >
> > Thanks,
> >
> > --Bin
> >
