On 10-Aug-09, at 8:03 PM, Ralph Castain wrote:
> Interesting! Well, I always make sure I have my personal OMPI build
> before any system stuff, and I work exclusively on Mac OS-X:
I am still finding this very mysterious....
I have removed all the OS-X-supplied libraries, recompiled and
installed openmpi 1.3.3, and I am *still* getting this warning when
running ompi_info:
[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_proxy"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_svc"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras
"mca_ras_dash_host" uses an MCA interface that is not recognized
(component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras
"mca_ras_hostfile" uses an MCA interface that is not recognized
(component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras
"mca_ras_localhost" uses an MCA interface that is not recognized
(component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_xgrid"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: rcache
"mca_rcache_rb" uses an MCA interface that is not recognized
(component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
So, I guess I'm not clear how the library can be an issue...
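To make the list of rejected components easier to compare against what is actually installed (Open MPI normally places its plugins in the lib/openmpi directory under the install prefix), here is a minimal Python sketch that pulls the framework and component names out of warnings like the ones above. The function name rejected_components and the SAMPLE text are only illustrative; nothing here is part of Open MPI, and the pattern assumes the warning wording is exactly as pasted.

```python
import re

# Pattern for the warning text shown above. Whitespace, including the line
# wraps added by the mail client, is normalised before matching.
WARNING = re.compile(
    r'component_find: (\w+) "(\w+)" uses an MCA interface that is not '
    r'recognized \(component MCA v([\d.]+) != supported MCA v([\d.]+)\)'
)


def rejected_components(log_text):
    """Return (framework, component, built_for, supported) tuples."""
    flat = " ".join(log_text.split())
    return WARNING.findall(flat)


# Two of the warnings from the ompi_info output, wrapped as in the mail.
SAMPLE = """
[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_proxy"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras
"mca_ras_dash_host" uses an MCA interface that is not recognized
(component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
"""

if __name__ == "__main__":
    for framework, name, built, supported in rejected_components(SAMPLE):
        print(f"{framework}: {name} (built for MCA v{built}, need v{supported})")
```

Piping the real ompi_info stderr into a file and passing its contents to rejected_components would give the full list of component names to look for on disk.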
I *do* get another error from running the gcm that I do not get from
running simpler jobs - hopefully this helps explain things:
[xserve03.local][[61029,1],4][btl_tcp_endpoint.c:486:mca_btl_tcp_endpoint_recv_connect_ack]
received unexpected process identifier [[61029,1],3]
The mitgcmuv processes are running on the xserves and using
considerable resources! They open STDERR/STDOUT, but nothing is
flushed into them, including the few print statements I've put in
before and after MPI_INIT (as Ralph suggested).
On 11-Aug-09, at 4:17 AM, Ashley Pittman wrote:
> If you suspect a hang then you can use the command orte-ps (on the
> node where the mpirun is running) and it should show you your job.
> This will tell you if the job is started and still running or if
> there was a problem launching.
/usr/local/openmpi/bin/orte-ps
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_proxy"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_svc"
uses an MCA interface that is not recognized (component MCA v1.0.0 !=
supported MCA v2.0.0) -- ignored
Information from mpirun [61029,0]
-----------------------------------
    JobID |   State | Slots | Num Procs |
------------------------------------------
[61029,1] | Running |     2 |        16 |

     Process Name |      ORTE Name | Local Rank |   PID |     Node |   State |
-------------------------------------------------------------------------------
../build/mitgcmuv |  [[61029,1],0] |          0 | 40206 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],1] |          0 | 40005 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],2] |          1 | 40207 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],3] |          1 | 40006 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],4] |          2 | 40208 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],5] |          2 | 40007 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],6] |          3 | 40209 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],7] |          3 | 40008 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],8] |          4 | 40210 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],9] |          4 | 40009 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],10] |          5 | 40211 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],11] |          5 | 40010 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],12] |          6 | 40212 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],13] |          6 | 40011 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],14] |          7 | 40213 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],15] |          7 | 40012 | xserve04 | Running |
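As a quick cross-check of the placement in the table above, here is a rough Python sketch that counts processes per node and per state from orte-ps text. The function name summarize_procs and the shortened SAMPLE are illustrative only, and the pattern assumes the column order shown above (ORTE name, local rank, PID, node, state).

```python
import re
from collections import Counter

# One process row: ORTE name, local rank, PID, node, state. Whitespace is
# flexible so that rows wrapped by a mail client still match once flattened.
ROW = re.compile(
    r'\[\[\d+,\d+\],(\d+)\]\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)\s*\|\s*(\w+)\s*\|'
)


def summarize_procs(table_text):
    """Return (processes per node, processes per state) as Counters."""
    flat = " ".join(table_text.split())
    per_node, per_state = Counter(), Counter()
    for _rank, _local_rank, _pid, node, state in ROW.findall(flat):
        per_node[node] += 1
        per_state[state] += 1
    return per_node, per_state


# First four rows of the table, wrapped the way a mail client might wrap them.
SAMPLE = """
../build/mitgcmuv | [[61029,1],0] | 0 | 40206 | xserve03 |
Running |
../build/mitgcmuv | [[61029,1],1] | 0 | 40005 | xserve04 |
Running |
../build/mitgcmuv | [[61029,1],2] | 1 | 40207 | xserve03 |
Running |
../build/mitgcmuv | [[61029,1],3] | 1 | 40006 | xserve04 |
Running |
"""

if __name__ == "__main__":
    nodes, states = summarize_procs(SAMPLE)
    print(dict(nodes))
    print(dict(states))
```

Run against the full table, this should confirm whether the 16 ranks are split evenly between xserve03 and xserve04 and whether all of them are reported as Running.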
Thanks, Jody