Hello all,
I am sorry to ask what is probably a newbie question. I have searched the
archives but am probably not using the right keywords to find the answer.
I am working on an atmospheric model which uses Open MPI/OpenRTE. I have two
nodes set up, but the model only runs on one node.
I can use mpirun to execute an application on another node by entering the
command below on HOST1:
mpirun --np 2 --host HOST2 APPNAME
In this scenario, the system connects via ssh to HOST2 and runs the application
without a problem.
If I attempt to run:
mpirun --np 2 --nolocal APPNAME
I get:
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in
file base/rmaps_base_support_fns.c at line 168
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in
file rmaps_rr.c at line 402
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in
file base/rmaps_base_map_job.c at line 210
[virtualModel1:03939] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in
file rmgr_urm.c at line 372
[virtualModel1:03939] mpirun: spawn failed with errno=-3
Looking at the source code, that is the area where the available nodes are
enumerated. If I am interpreting this correctly, the error appears to indicate
that no "non-local" node is available.
I have the hosts file set up correctly, along with the ssh key, so the user can
log in without a password. What I don't know is where the system looks up the
node IPs so that the nodes can be enumerated.
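To make the question concrete, here is a minimal sketch of how an explicit node
list might be handed to mpirun, assuming the mapper takes its nodes from an
Open MPI hostfile rather than from /etc/hosts. The file name my_hostfile and
the slot counts are placeholders and not taken from my setup, and I do not know
whether this is the mechanism the --nolocal run above is missing.

```shell
#!/bin/sh
# Hypothetical hostfile: one line per node, each with an assumed slot count.
HOSTFILE=./my_hostfile
cat > "$HOSTFILE" <<'EOF'
HOST1 slots=2
HOST2 slots=2
EOF

# Build the command line; APPNAME is the same placeholder used above.
CMD="mpirun --np 2 --hostfile $HOSTFILE --nolocal APPNAME"

# Only print the command here; run it by hand once the hostfile
# lists the real node names.
echo "$CMD"
```

With a list like this, HOST2 would be the only node left after --nolocal
excludes the node mpirun is started on.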
Can someone give me a quick pointer to the correct location in the manual? (I
realize the answer is RTM, but I have not found the answer in the manual thus
far, so I figured I would throw it out there to the experts.)
Thanks for your patience with my query.