Hey,
I downloaded, compiled, and installed Open MPI from the web site but
was having problems. I then installed Open MPI and its libs via yum.
After that I was able to compile and run MPI programs locally but ran
into problems when I tried to run them across two computers (see
output below). I was told that this was probably because I have two
different copies of openmpi. I uninstalled openmpi via yum and am now
trying to figure out the best way to uninstall the compiled version of
openmpi to reinstall with a clean slate. Any suggestions?
-Jacob
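Before removing anything, it can help to see which copies are actually still reachable. The following is a minimal sketch, not an official Open MPI tool: the function name find_mpirun_copies is made up for illustration, and it only looks at directories on PATH, so an install outside PATH would be missed. If the original build tree is still around, running "make uninstall" from it (with the same privileges used for "make install") is the usual way to remove a source build, assuming the tree is still configured with the same prefix.

```shell
# List every mpirun reachable through PATH, in the order the shell
# would search. More than one line of output suggests that several
# Open MPI installations are competing with each other.
# (find_mpirun_copies is a hypothetical helper name.)
find_mpirun_copies() (
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        f="${dir:-.}/mpirun"
        if [ -f "$f" ] && [ -x "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
)

find_mpirun_copies
```

Running this on each machine after the yum removal should show whether a compiled copy is still being picked up, and from which directory.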
On Jun 2, 2009, at 4:45 AM, Jeff Squyres wrote:
This looks like you have two different versions of Open MPI installed
on your two machines (it's hard to tell with the name
"localhost.localdomain", though -- can you name your two computers
differently so that you can tell them apart in the output?).
You need to have the same version of Open MPI installed on both
machines.
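One way to check this is to ask each machine which mpirun it finds and which version ompi_info reports. The sketch below is only illustrative: report_mpi is a hypothetical helper, it assumes a Bourne-compatible login shell on the remote side, and it assumes passwordless ssh is already working. Querying over ssh is deliberate, since it shows the environment a non-interactive remote session gets, which can differ from an interactive login.

```shell
# Report where mpirun lives and which Open MPI version ompi_info announces.
# Pass "local" for this machine, or a hostname/IP to query it over ssh.
# (report_mpi is a hypothetical helper name.)
report_mpi() {
    host=$1
    # Single-quoted so that it is evaluated on the target machine, not here.
    probe='p=$(command -v mpirun) || p="mpirun not found"
v=$(ompi_info 2>/dev/null | grep "Open MPI:") || v="version unknown"
echo "$p | $v"'
    if [ "$host" = "local" ]; then
        out=$(sh -c "$probe")
    else
        out=$(ssh "$host" "$probe")
    fi
    echo "$host: $out"
}

report_mpi local
```

For the second machine the call would be something like "report_mpi 192.168.0.3". If the two lines show different paths or different version numbers, the installations do not match.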
On Jun 2, 2009, at 3:52 AM, Jacob Balthazor wrote:
Hi,
I am just getting my feet wet with openmpi and can't seem to
get it running. I installed openmpi and all its components via yum
and am able to compile and run programs with MPI locally but not across
the two computers. I set up keyed ssh on both machines and am able to
log in from one to the other without being asked for a password. From reading
online it looks like my problem may stem from an
unconfigured .bash_profile as I don't think yum would have
configured it for me. My question is where does yum stick the bin
and lib files that I need to reference in my profile? What should my
bash_profile look like? Thank you for reading; I am eagerly
awaiting your reply.
-Jacob
My Profile:
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
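For comparison, a profile that points at an Open MPI installation usually adds its bin directory to PATH and its lib directory to LD_LIBRARY_PATH. The lines below are a sketch under stated assumptions: OMPI_PREFIX and the path /opt/openmpi are hypothetical placeholders, not where yum necessarily puts the files. On an RPM-based system, "rpm -ql openmpi" should list the files the package installed, assuming the package is named openmpi as in the message above.

```shell
# Possible additions to .bash_profile (sketch only).
# OMPI_PREFIX is a hypothetical install prefix; replace /opt/openmpi with
# wherever Open MPI actually lives, and use the same setup on each machine.
OMPI_PREFIX=/opt/openmpi

# Put the Open MPI tools (mpirun, mpicc, ...) ahead of anything else on PATH.
PATH=$OMPI_PREFIX/bin:$PATH

# Let the dynamic linker find the matching libraries; the ${VAR:+...} form
# avoids a trailing colon when LD_LIBRARY_PATH was previously empty.
LD_LIBRARY_PATH=$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}

export PATH LD_LIBRARY_PATH
```

Non-interactive ssh sessions typically read ~/.bashrc rather than ~/.bash_profile, so the same lines may need to go there instead for remote launches to see them.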
When I try to run it:
[beowulf2@localhost Desktop]$ mpirun --hostfile hostfile -np 4 a.out
[localhost.localdomain:08564] Error: unknown option "--bootproxy"
input in flex scanner failed
[localhost.localdomain:06428] [0,0,0] ORTE_ERROR_LOG: Timeout in
file base/pls_base_orted_cmds.c at line 275
[localhost.localdomain:06428] [0,0,0] ORTE_ERROR_LOG: Timeout in
file pls_rsh_module.c at line 1166
[localhost.localdomain:06428] [0,0,0] ORTE_ERROR_LOG: Timeout in
file errmgr_hnp.c at line 90
[localhost.localdomain:06428] ERROR: A daemon on node 192.168.0.3
failed to start as expected.
[localhost.localdomain:06428] ERROR: There may be more information
available from
[localhost.localdomain:06428] ERROR: the remote shell (see above).
[localhost.localdomain:06428] ERROR: The daemon exited unexpectedly
with status 2.
[localhost.localdomain:06428] [0,0,0] ORTE_ERROR_LOG: Timeout in
file base/pls_base_orted_cmds.c at line 188
[localhost.localdomain:06428] [0,0,0] ORTE_ERROR_LOG: Timeout in
file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job.
Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
[beowulf2@localhost Desktop]$