Interesting! Well, I always make sure my personal OMPI build comes
before any system stuff, and I work exclusively on Mac OS X:
rhc$ echo $PATH
/Library/Frameworks/Python.framework/Versions/Current/bin:/Users/rhc/openmpi/bin:/Users/rhc/bin:/opt/local/bin:/usr/X11R6/bin:/usr/local/bin:/opt/local/bin:/opt/local/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/texbin
rhc$ echo $LD_LIBRARY_PATH
/Users/rhc/openmpi/lib:/Users/rhc/lib:/opt/local/lib:/usr/X11R6/lib:/usr/local/lib:
Note that I always configure with --prefix=somewhere-in-my-own-dir,
never into a system directory; that avoids exactly this kind of confusion.
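For example, the build boils down to something like this (just a
sketch; adjust the tarball version and prefix to taste):

cd openmpi-1.3.3
./configure --prefix=$HOME/openmpi
make all install
export PATH=$HOME/openmpi/bin:$PATH

With the prefix under your home directory, nothing you build can
collide with the OMPI bits Apple ships in system locations.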
What the errors are saying is that we are picking up components from a
very old version of OMPI that Apple distributes. It may or may not be
confusing the system - hard to tell. However, the fact that it is the
IO forwarding subsystem that is picking them up, and the fact that you
aren't seeing any output from your job, makes me a tad suspicious.
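One quick way to see where those components are being loaded from is
to ask for the component search path (assuming the ompi_info on your
PATH is the 1.3.3 one):

ompi_info --param all all | grep -i component_path

Any directory in that list that still holds the old Apple-era mca_*
files is a candidate culprit.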
Can you run other jobs? In other words, do you get stdout/stderr from
other programs you run, or does every MPI program hang (even simple
ones)? If it is just your program, it could simply be hanging before
any output is generated. Can you have it print something to stderr
right when it starts?
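For example, a minimal test along these lines (a sketch, not your
application; it writes to stderr both before and right after MPI_Init):

cat > hello.c <<'EOF'
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;

    /* written before MPI is even initialized */
    fprintf(stderr, "hello: before MPI_Init\n");
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* if IO forwarding is healthy, mpirun should relay one line per rank */
    fprintf(stderr, "hello: rank %d is alive\n", rank);
    MPI_Finalize();
    return 0;
}
EOF
mpicc hello.c -o hello
mpirun -np 2 ./hello

If both ranks print but the MITgcm stays silent, IO forwarding is
probably fine and the model itself is hanging.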
On Aug 10, 2009, at 8:53 PM, Klymak Jody wrote:
On 10-Aug-09, at 6:44 PM, Ralph Castain wrote:
Check your LD_LIBRARY_PATH - there is an earlier version of OMPI in
your path that is interfering with operation (i.e., it comes before
your 1.3.3 installation).
Hmmm, the OS X FAQ says not to do this:
"Note that there is no need to add Open MPI's libdir to
LD_LIBRARY_PATH; Open MPI's shared library build process
automatically uses the "rpath" mechanism to automatically find the
correct shared libraries (i.e., the ones associated with this build,
vs., for example, the OS X-shipped OMPI shared libraries). Also note
that we specifically do not recommend adding Open MPI's libdir to
DYLD_LIBRARY_PATH."
http://www.open-mpi.org/faq/?category=osx
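For reference, otool -L shows which dylibs a binary actually
references on OS X (the binary name below is only a placeholder for
the MITgcm executable):

otool -L ./my_mpi_binary | grep -i mpi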
Regardless, if I set either and run ompi_info, I still get:
[saturna.cluster:94981] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:94981] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
echo $DYLD_LIBRARY_PATH $LD_LIBRARY_PATH
/usr/local/openmpi/lib: /usr/local/openmpi/lib:
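A quick way to double-check which install the shell finds first (no
surprises expected, but worth ruling out):

which mpirun ompi_info
ompi_info | grep "Open MPI:"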
So I'm afraid I'm stumped again. I suppose I could go clean out all
the libraries in /usr/lib/...
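Before deleting anything, it might be safer to just locate the stale
components first (a sketch; I'm assuming Apple's old mca_* component
files live somewhere under /usr/lib):

find /usr/lib -name 'mca_*' -print 2>/dev/null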
Thanks again, sorry to be a pain...
Cheers, Jody
On Aug 10, 2009, at 7:38 PM, Klymak Jody wrote:
So,
mpirun --display-allocation -pernode --display-map hostname
gives me the output below. Simple jobs seem to run, but the MITgcm
does not, either under ssh or under torque. It hangs at some early
point in execution, before anything is written, so it's hard for me
to tell what the error is. Could these MCA warnings have anything to
do with it?
I've recompiled the gcm with -L /usr/local/openmpi/lib, so
hopefully that catches the right library.
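Rather than hand-rolling -L flags, the Open MPI wrapper compilers
already carry their own library directory; --showme prints exactly
what they would pass (mpif77 shown here since the gcm is Fortran):

mpicc --showme
mpif77 --showme:link

so linking through the wrappers should pick up the 1.3.3 libraries.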
Thanks, Jody
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_dash_host" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_hostfile" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_localhost" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_xgrid" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
====================== ALLOCATED NODES ======================
Data for node: Name: xserve02.local Num slots: 8 Max slots: 0
Data for node: Name: xserve01.local Num slots: 8 Max slots: 0
=================================================================
======================== JOB MAP ========================
Data for node: Name: xserve02.local Num procs: 1
Process OMPI jobid: [20967,1] Process rank: 0
Data for node: Name: xserve01.local Num procs: 1
Process OMPI jobid: [20967,1] Process rank: 1
=============================================================
[xserve01.cluster:38518] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve01.cluster:38518] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
xserve02.local
xserve01.cluster
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users