On 10-Aug-09, at 8:03 PM, Ralph Castain wrote:

Interesting! Well, I always make sure I have my personal OMPI build before any system stuff, and I work exclusively on Mac OS-X:

I am still finding this very mysterious....

I have removed all the OS X-supplied libraries, recompiled and installed openmpi 1.3.3, and I am *still* getting this warning when running ompi_info:

[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_dash_host" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_hostfile" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_localhost" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_xgrid" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:50307] mca: base: component_find: rcache "mca_rcache_rb" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
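
The warnings above all follow one pattern, so the rejected components can be pulled out and grouped by framework. The following is a minimal Python sketch, not part of Open MPI; the function name `rejected_components` and the abbreviated `sample` text are illustrative assumptions. One possible use is to compare the resulting names against the component files in the install's plugin directory, on the assumption (which this thread does not confirm) that files left over from an earlier install are what triggers the warning.

```python
import re
from collections import defaultdict

# Pattern for one "not recognized" warning as printed by ompi_info above.
# Groups: framework, component name, component MCA version, supported MCA version.
WARNING = re.compile(
    r'component_find: (\w+) "(\w+)" uses an MCA interface that is not recognized '
    r'\(component MCA v([\d.]+) != supported MCA v([\d.]+)\)'
)


def rejected_components(log_text):
    """Group rejected component names by framework (iof, ras, rcache, ...)."""
    # Collapse line breaks and repeated spaces so wrapped output still matches.
    normalized = " ".join(log_text.split())
    by_framework = defaultdict(list)
    for framework, component, _found, _supported in WARNING.findall(normalized):
        by_framework[framework].append(component)
    return dict(by_framework)


# Abbreviated copy of the output above (hypothetical variable name).
sample = (
    '[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_proxy" uses an MCA '
    'interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored '
    '[saturna.cluster:50307] mca: base: component_find: iof "mca_iof_svc" uses an MCA '
    'interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored '
    '[saturna.cluster:50307] mca: base: component_find: ras "mca_ras_xgrid" uses an MCA '
    'interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored'
)

if __name__ == "__main__":
    for framework, names in sorted(rejected_components(sample).items()):
        print(framework, names)
```

The whitespace is normalized before matching so the same function works whether the warnings arrive one per line or run together as pasted text.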

So, I guess I'm not clear how the library can be an issue...

I *do* get another error from running the gcm that I do not get from running simpler jobs - hopefully this helps explain things:

[xserve03.local][[61029,1],4][btl_tcp_endpoint.c:486:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier [[61029,1],3]

The mitgcmuv processes are running on the xserves and using considerable resources! They open STDERR/STDOUT, but nothing is flushed to them, including the few print statements I've put in before and after MPI_INIT (as Ralph suggested).

On 11-Aug-09, at 4:17 AM, Ashley Pittman wrote:

If you suspect a hang then you can use the command orte-ps (on the node where the mpirun is running) and it should show you your job. This will tell you if the job is started and still running or if there was a problem launching.

/usr/local/openmpi/bin/orte-ps
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_proxy" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored
[saturna.cluster:51840] mca: base: component_find: iof "mca_iof_svc" uses an MCA interface that is not recognized (component MCA v1.0.0 != supported MCA v2.0.0) -- ignored


Information from mpirun [61029,0]
-----------------------------------

    JobID |   State |  Slots | Num Procs |
------------------------------------------
[61029,1] | Running |      2 |        16 |

     Process Name |      ORTE Name | Local Rank |   PID |     Node |   State |
------------------------------------------------------------------------------
../build/mitgcmuv |  [[61029,1],0] |          0 | 40206 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],1] |          0 | 40005 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],2] |          1 | 40207 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],3] |          1 | 40006 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],4] |          2 | 40208 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],5] |          2 | 40007 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],6] |          3 | 40209 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],7] |          3 | 40008 | xserve04 | Running |
../build/mitgcmuv |  [[61029,1],8] |          4 | 40210 | xserve03 | Running |
../build/mitgcmuv |  [[61029,1],9] |          4 | 40009 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],10] |          5 | 40211 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],11] |          5 | 40010 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],12] |          6 | 40212 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],13] |          6 | 40011 | xserve04 | Running |
../build/mitgcmuv | [[61029,1],14] |          7 | 40213 | xserve03 | Running |
../build/mitgcmuv | [[61029,1],15] |          7 | 40012 | xserve04 | Running |
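
For a listing this long it can help to tally the process rows per node and state rather than read them by eye. Below is a minimal Python sketch that does this for text in the layout shown above; the function name `procs_per_node` and the shortened `sample` are illustrative assumptions, and the regular expression is fitted only to this particular output, not to any documented orte-ps format.

```python
import re
from collections import Counter

# One process row: name | [[jobfam,job],rank] | local rank | pid | node | state |
# Spacing around the pipes is allowed to vary.
ROW = re.compile(
    r'(\S+)\s*\|\s*\[\[(\d+),(\d+)\],(\d+)\]\s*\|\s*(\d+)\s*\|\s*(\d+)'
    r'\s*\|\s*(\S+)\s*\|\s*(\w+)\s*\|'
)


def procs_per_node(orte_ps_text):
    """Count process rows for each (node, state) pair in orte-ps style text."""
    counts = Counter()
    for fields in ROW.findall(orte_ps_text):
        node, state = fields[6], fields[7]
        counts[(node, state)] += 1
    return dict(counts)


# First four rows of the listing above (hypothetical variable name).
sample = (
    "../build/mitgcmuv | [[61029,1],0] | 0 | 40206 | xserve03 | Running | "
    "../build/mitgcmuv | [[61029,1],1] | 0 | 40005 | xserve04 | Running | "
    "../build/mitgcmuv | [[61029,1],2] | 1 | 40207 | xserve03 | Running | "
    "../build/mitgcmuv | [[61029,1],3] | 1 | 40006 | xserve04 | Running |"
)

if __name__ == "__main__":
    for (node, state), n in sorted(procs_per_node(sample).items()):
        print(node, state, n)
```

The job summary row (`[61029,1] | Running | ...`) is skipped on purpose: only rows whose second column has the `[[job],rank]` form are counted.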

Thanks,  Jody


