[clement@kfc TestMPI]$ mpirun -d -np 2 test
[kfc:29199] procdir: (null)
[kfc:29199] jobdir: (null)
[kfc:29199] unidir: /tmp/openmpi-sessions-clement@kfc_0/default-universe
[kfc:29199] top: openmpi-sessions-clement@kfc_0
[kfc:29199] tmp: /tmp
[kfc:29199] [0,0,0] setting up session dir with
[kfc:29199] tmpdir /tmp
[kfc:29199] universe default-universe-29199
[kfc:29199] user clement
[kfc:29199] host kfc
[kfc:29199] jobid 0
[kfc:29199] procid 0
[kfc:29199] procdir:
/tmp/openmpi-sessions-clement@kfc_0/default-universe-29199/0/0
[kfc:29199] jobdir:
/tmp/openmpi-sessions-clement@kfc_0/default-universe-29199/0
[kfc:29199] unidir:
/tmp/openmpi-sessions-clement@kfc_0/default-universe-29199
[kfc:29199] top: openmpi-sessions-clement@kfc_0
[kfc:29199] tmp: /tmp
[kfc:29199] [0,0,0] contact_file
/tmp/openmpi-sessions-clement@kfc_0/default-universe-29199/universe-setup.txt
[kfc:29199] [0,0,0] wrote setup file
[kfc:29199] pls:rsh: local csh: 0, local bash: 1
[kfc:29199] pls:rsh: assuming same remote shell as local shell
[kfc:29199] pls:rsh: remote csh: 0, remote bash: 1
[kfc:29199] pls:rsh: final template argv:
[kfc:29199] pls:rsh: ssh <template> orted --debug --bootproxy 1
--name <template> --num_procs 2 --vpid_start 0 --nodename <template>
--universe clement@kfc:default-universe-29199 --nsreplica
"0.0.0;tcp://192.168.11.101:32784" --gprreplica
"0.0.0;tcp://192.168.11.101:32784" --mpi-call-yield 0
[kfc:29199] pls:rsh: launching on node localhost
[kfc:29199] pls:rsh: oversubscribed -- setting mpi_yield_when_idle to 1
(1 2)
[kfc:29199] sess_dir_finalize: proc session dir not empty - leaving
[kfc:29199] spawn: in job_state_callback(jobid = 1, state = 0xa)
mpirun noticed that job rank 1 with PID 0 on node "localhost" exited on
signal 11.
[kfc:29199] sess_dir_finalize: proc session dir not empty - leaving
[kfc:29199] spawn: in job_state_callback(jobid = 1, state = 0x9)
[kfc:29199] ERROR: A daemon on node localhost failed to start as expected.
[kfc:29199] ERROR: There may be more information available from
[kfc:29199] ERROR: the remote shell (see above).
[kfc:29199] The daemon received a signal 11.
1 additional process aborted (not shown)
[kfc:29199] sess_dir_finalize: found proc session dir empty - deleting
[kfc:29199] sess_dir_finalize: found job session dir empty - deleting
[kfc:29199] sess_dir_finalize: found univ session dir empty - deleting
[kfc:29199] sess_dir_finalize: top session dir not empty - leaving
ompi_info output:
[clement@kfc TestMPI]$ ompi_info
Open MPI: 1.0rc5r8053
Open MPI SVN revision: r8053
Open RTE: 1.0rc5r8053
Open RTE SVN revision: r8053
OPAL: 1.0rc5r8053
OPAL SVN revision: r8053
Prefix: /home/clement/openmpi
Configured architecture: i686-pc-linux-gnu
Configured by: clement
Configured on: Fri Nov 11 00:37:23 EST 2005
Configure host: kfc
Built by: clement
Built on: Fri Nov 11 00:59:26 EST 2005
Built host: kfc
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
C compiler: gcc
C compiler absolute: /usr/bin/gcc
C++ compiler: g++
C++ compiler absolute: /usr/bin/g++
Fortran77 compiler: gfortran
Fortran77 compiler abs: /usr/bin/gfortran
Fortran90 compiler: gfortran
Fortran90 compiler abs: /usr/bin/gfortran
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: no
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: 1
MCA memory: malloc_hooks (MCA v1.0, API v1.0, Component v1.0)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.0)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.0)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.0)
MCA coll: self (MCA v1.0, API v1.0, Component v1.0)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.0)
MCA io: romio (MCA v1.0, API v1.0, Component v1.0)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.0)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.0)
MCA pml: teg (MCA v1.0, API v1.0, Component v1.0)
MCA pml: uniq (MCA v1.0, API v1.0, Component v1.0)
MCA ptl: self (MCA v1.0, API v1.0, Component v1.0)
MCA ptl: sm (MCA v1.0, API v1.0, Component v1.0)
MCA ptl: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA btl: self (MCA v1.0, API v1.0, Component v1.0)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.0)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.0)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.0)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.0)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.0)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.0)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.0)
MCA ns: proxy (MCA v1.0, API v1.0, Component v1.0)
MCA ns: replica (MCA v1.0, API v1.0, Component v1.0)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.0)
MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.0)
MCA ras: localhost (MCA v1.0, API v1.0, Component v1.0)
MCA ras: slurm (MCA v1.0, API v1.0, Component v1.0)
MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.0)
MCA rds: resfile (MCA v1.0, API v1.0, Component v1.0)
MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.0)
MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.0)
MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.0)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.0)
MCA pls: fork (MCA v1.0, API v1.0, Component v1.0)
MCA pls: proxy (MCA v1.0, API v1.0, Component v1.0)
MCA pls: rsh (MCA v1.0, API v1.0, Component v1.0)
MCA pls: slurm (MCA v1.0, API v1.0, Component v1.0)
MCA sds: env (MCA v1.0, API v1.0, Component v1.0)
MCA sds: pipe (MCA v1.0, API v1.0, Component v1.0)
MCA sds: seed (MCA v1.0, API v1.0, Component v1.0)
MCA sds: singleton (MCA v1.0, API v1.0, Component v1.0)
MCA sds: slurm (MCA v1.0, API v1.0, Component v1.0)
[clement@kfc TestMPI]$
Jeff Squyres wrote:
I'm sorry -- I wasn't entirely clear:
1. Are you using a 1.0 nightly tarball or a 1.1 nightly tarball? We
have made a bunch of fixes to the 1.1 tree (i.e., the Subversion
trunk), but have not fully vetted them yet, so they have not yet been
taken to the 1.0 release branch. If you have not done so already,
could you try a tarball from the trunk?
http://www.open-mpi.org/nightly/trunk/
2. The error you are seeing looks like a proxy process is failing to
start because it segfaults. Are you getting corefiles? If so, can you
send the backtrace? The corefile should be from the $prefix/bin/orted
executable. (A sketch of the commands for this is given after this
list.)
3. Failing that, can you run with the "-d" switch? It should give a
bunch of debugging output that might be helpful. "mpirun -d -np 2
./test", for example.
4. Also please send the output of the "ompi_info" command.
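As a sketch for item 2, assuming bash and that core dumps are
currently disabled by the shell's default ulimit (the core file name
depends on the kernel's core_pattern setting, so "core.<pid>" below is
only a placeholder), the backtrace could be captured like this:

    ulimit -c unlimited        # allow core dumps in this shell
    mpirun -np 2 test          # reproduce the crash
    gdb /home/clement/openmpi/bin/orted core.<pid>
    (gdb) bt                   # print the backtrace of the crashed orted

The orted path above is taken from the "Prefix:" line in the ompi_info
output earlier in this message.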
On Nov 10, 2005, at 9:05 AM, Clement Chu wrote:
I have tried the latest version (rc5, r8053), but the error is still
there.
Jeff Squyres wrote:
We've actually made quite a few bug fixes since RC4 (RC5 is not
available yet). Would you mind trying with a nightly snapshot
tarball?
(there were some SVN commits last night after the nightly snapshot was
made; I've just initiated another snapshot build -- r8085 should be on
the web site within an hour or so)
On Nov 10, 2005, at 4:38 AM, Clement Chu wrote:
Hi,
I got an error when I tried to run mpirun on an MPI program. The
following is the error message:
[clement@kfc TestMPI]$ mpicc -g -o test main.c
[clement@kfc TestMPI]$ mpirun -np 2 test
mpirun noticed that job rank 1 with PID 0 on node "localhost" exited on signal 11.
[kfc:28466] ERROR: A daemon on node localhost failed to start as expected.
[kfc:28466] ERROR: There may be more information available from
[kfc:28466] ERROR: the remote shell (see above).
[kfc:28466] The daemon received a signal 11.
1 additional process aborted (not shown)
[clement@kfc TestMPI]$
I am using openmpi-1.0rc4 and running on Linux Red Hat Fedora Core 4.
The kernel is 2.6.12-1.1456_FC4. My building procedure is as below:
1. ./configure --prefix=/home/clement/openmpi --with-devel-headers
2. make all install
3. log in as root and add Open MPI's bin and lib paths in /etc/bashrc
4. check $PATH and $LD_LIBRARY_PATH, shown below
[clement@kfc TestMPI]$ echo $PATH
/usr/java/jdk1.5.0_05/bin:/home/clement/openmpi/bin:/usr/java/jdk1.5.0_05/bin:/home/clement/mpich-1.2.7/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/home/clement/bin
[clement@kfc TestMPI]$ echo $LD_LIBRARY_PATH
/home/clement/openmpi/lib
[clement@kfc TestMPI]$
5. go to the MPI program's directory
6. mpicc -g -o test main.c (a minimal example of what main.c might
look like is sketched after these steps)
7. mpirun -np 2 test
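For reference, a minimal main.c along these lines (hypothetical; the
actual test program was not posted) is enough to exercise the launch
path that is failing:

    /* minimal MPI test program (hypothetical stand-in for main.c) */
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
        int rank, size;

        MPI_Init(&argc, &argv);                 /* start MPI */
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total process count */
        printf("Hello from rank %d of %d\n", rank, size);
        MPI_Finalize();                         /* shut MPI down */
        return 0;
    }

If even a program this small fails in the same way, the crash is
likely in the launcher (orted) rather than in the application code.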
Any ideas about this problem? Many thanks.
--
Clement Kam Man Chu
Research Assistant
School of Computer Science & Software Engineering
Monash University, Caulfield Campus
Ph: 61 3 9903 1964