Paul Kapinos wrote:
Hi Jeff again!
(update) It works with "plain" Open MPI, but it does *not* work with Sun
Cluster Tools 8.0 (which is also an Open MPI build). So it seems to be a Sun
problem and not a general problem of Open MPI. Sorry for wrongly attributing
the problem.
Ah, gotcha. I guess my Sun colleagues on this list will need to
address that. ;-)
I hope!
The only trouble we have now is error messages like
--------------------------------------------------------------------------
Sorry! You were supposed to get help about:
no hca params found
from the file:
help-mpi-btl-openib.txt
But I couldn't find any file matching that name. Sorry!
--------------------------------------------------------------------------
(the job still runs without problems! :o)
when running Open MPI from the new location after the old location has been
removed. (If the old location still exists, there is no error, so it seems
to be an attempt to access a file at the old path.)
Doh; that's weird.
Maybe we have to explicitly pass the OPAL_PREFIX environment
variable to all processes?
Hmm. I don't need to do this in my 1.2.7 installation. I do
something like this (I assume you're using rsh/ssh as a launcher?):
We use zsh as the login shell, ssh as the communication protocol, and a
wrapper around mpiexec which produces a command somewhat like
/opt/MPI/openmpi-1.2.7/linux64/intel/bin/mpiexec -x LD_LIBRARY_PATH -x
PATH -x MPI_NAME --hostfile
/tmp/pk224850/26654@linuxhtc01/hostfile3564 -n 2 MPI_FastTest.exe
(the hostfiles are generated temporarily by our wrapper for load
balancing, and /opt/MPI/openmpi-1.2.7/linux64/intel/ is the path to
our local installation of Open MPI...)
As you can see, we also explicitly tell Open MPI to export the environment
variables PATH and LD_LIBRARY_PATH.
If we add a " -x OPAL_PREFIX " flag, and thereby force explicit
forwarding of this environment variable, the error no longer occurs. So
we conclude that this variable needs to be exported across *all*
systems in the cluster.
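In other words, the command produced by our wrapper then looks something
like this (the same command as above, only with the additional flag):
/opt/MPI/openmpi-1.2.7/linux64/intel/bin/mpiexec -x LD_LIBRARY_PATH -x
PATH -x MPI_NAME -x OPAL_PREFIX --hostfile
/tmp/pk224850/26654@linuxhtc01/hostfile3564 -n 2 MPI_FastTest.exe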
It seems that the variable OPAL_PREFIX is *NOT* automatically
exported to new processes on the local and remote nodes.
Maybe the FAQ at
http://www.open-mpi.org/faq/?category=building#installdirs should be
extended to mention this?
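For example, the FAQ entry could show something like the following (only a
sketch; /newpath/openmpi below is just a placeholder for wherever the
relocated installation actually lives):

# point Open MPI at the relocated installation
export OPAL_PREFIX=/newpath/openmpi
export PATH=$OPAL_PREFIX/bin:$PATH
export LD_LIBRARY_PATH=$OPAL_PREFIX/lib:$LD_LIBRARY_PATH
# forward the variables to the processes on all nodes
mpiexec -x OPAL_PREFIX -x PATH -x LD_LIBRARY_PATH -n 2 MPI_FastTest.exe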
Have you (or anyone reading this message) had any contact with Sun
developers to point out this circumstance? *Why* do they use
hard-coded paths? :o)
I don't know -- this sounds like an issue with the Sun CT 8 build
process. It could also be a by-product of using the combined 32/64
feature...? I haven't used that in forever and I don't remember the
restrictions. Terry/Rolf -- can you comment?
I will write a separate email to ct-feedb...@sun.com
Hi Paul:
Yes, there are Sun people on this list! We originally put those
hardcoded paths in to make everything work correctly out of the box and
our install process ensured that everything would be at
/opt/SUNWhpc/HPC8.0. However, let us take a look at everything that was
just discussed here and see what we can do. We will get back to you
shortly.
Rolf