On 10-Aug-09, at 8:03 PM, Ralph Castain wrote:
Interesting! Well, I always make sure my personal OMPI build comes before any system stuff in my path, and I work exclusively on Mac OS X.
Note that I always configure with --prefix=somewhere-in-my-own-dir, never to a system directory. That avoids this kind of confusion.
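A quick sanity check: "which mpirun" and "mpirun --version" (or "which ompi_info") should resolve to your 1.3.3 install, not to whatever Apple shipped under /usr.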
Yeah, I did configure --prefix=/usr/local/openmpi
What the errors are saying is that we are picking up components from
a very old version of OMPI that is distributed by Apple. It may or
may not be causing confusion for the system - hard to tell. However,
the fact that it is the IO forwarding subsystem that is picking them
up, and the fact that you aren't seeing any output from your job,
makes me a tad suspicious.
Me too!
Can you run other jobs? In other words, do you get stdout/stderr
from other programs you run, or does every MPI program hang (even
simple ones)? If it is just your program, then it could just be that
your application is hanging before any output is generated. Can you
have it print something to stderr right when it starts?
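Something as small as the sketch below (plain C, nothing specific to your setup) would do: compile it with your 1.3.3 mpicc, launch it exactly the way you launch the gcm, and if its stderr line shows up then the hang is inside the application rather than in the IO forwarding.

#include <stdio.h>
#include <mpi.h>

/* Minimal check: print to stderr immediately after startup. */
int main(int argc, char **argv)
{
    int rank = -1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* stderr is unbuffered, so this should show up right away
       if the IO forwarding is healthy. */
    fprintf(stderr, "rank %d: alive\n", rank);

    MPI_Finalize();
    return 0;
}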
No; simple ones, like the examples I gave before, run fine, just with the suspicious warnings.
I'm running a big general circulation model (MITgcm). Under normal conditions it spits something out almost right away, and that is not happening here. STDOUT.0001 etc. are all opened, but nothing is written to them.
I'm pretty sure I'm compiling the gcm properly:
otool -L mitgcmuv
mitgcmuv:
        /usr/local/openmpi/lib/libmpi_f77.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/local/openmpi/lib/libmpi.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/local/openmpi/lib/libopen-rte.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/local/openmpi/lib/libopen-pal.0.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libutil.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/local/lib/libgfortran.3.dylib (compatibility version 4.0.0, current version 4.0.0)
        /usr/local/lib/libgcc_s.1.dylib (compatibility version 1.0.0, current version 1.0.0)
        /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 111.1.3)
Thanks, Jody
On Aug 10, 2009, at 8:53 PM, Klymak Jody wrote:
On 10-Aug-09, at 6:44 PM, Ralph Castain wrote:
Check your LD_LIBRARY_PATH - there is an earlier version of OMPI
in your path that is interfering with operation (i.e., it comes
before your 1.3.3 installation).
Hmmmm, the OS X FAQ says not to do this:
"Note that there is no need to add Open MPI's libdir to
LD_LIBRARY_PATH; Open MPI's shared library build process
automatically uses the "rpath" mechanism to automatically find the
correct shared libraries (i.e., the ones associated with this
build, vs., for example, the OS X-shipped OMPI shared libraries).
Also note that we specifically do not recommend adding Open MPI's
libdir to DYLD_LIBRARY_PATH."
http://www.open-mpi.org/faq/?category=osx
Regardless, if I set either and run ompi_info, I still get:
[saturna.cluster:94981] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:94981] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
echo $DYLD_LIBRARY_PATH $LD_LIBRARY_PATH
/usr/local/openmpi/lib: /usr/local/openmpi/lib:
So I'm afraid I'm stumped again. I suppose I could go clean out
all the libraries in /usr/lib/...
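(If the stale plugins live where I think they do, something like "ls /usr/lib/openmpi" should list the old mca_* components, though I'd want to double-check that path before deleting anything.)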
Thanks again, sorry to be a pain...
Cheers, Jody
On Aug 10, 2009, at 7:38 PM, Klymak Jody wrote:
So,
mpirun --display-allocation -pernode --display-map hostname
gives me the output below. Simple jobs seem to run, but the
MITgcm does not, either under ssh or torque. It hangs at some
early point in execution before anything is written, so it's hard
for me to tell what the error is. Could these MCA warnings have
anything to do with it?
I've recompiled the gcm with -L /usr/local/openmpi/lib, so
hopefully that catches the right library.
Thanks, Jody
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_dash_host" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_hostfile" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_localhost" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: ras "mca_ras_xgrid" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve02.local:38126] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
====================== ALLOCATED NODES ======================
Data for node: Name: xserve02.local Num slots: 8 Max slots: 0
Data for node: Name: xserve01.local Num slots: 8 Max slots: 0
=================================================================
======================== JOB MAP ========================
Data for node: Name: xserve02.local Num procs: 1
Process OMPI jobid: [20967,1] Process rank: 0
Data for node: Name: xserve01.local Num procs: 1
Process OMPI jobid: [20967,1] Process rank: 1
=============================================================
[xserve01.cluster:38518] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[xserve01.cluster:38518] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
xserve02.local
xserve01.cluster