Hello,

Having a bit of trouble running Open MPI 1.2 under Torque 2.1.8. My script contains the following:

-----------------------------------------------
HPCC_HOME=/home/test/hpcc-1.0.0
ncpus=`wc -l $PBS_NODEFILE`
mpirun -np $ncpus $HPCC_HOME/hpcc
-----------------------------------------------

When I try to run on 4 nodes, 4 cpus each I receive the following in my err file:

[node003:04409] [0,0,4] ORTE_ERROR_LOG: Not found in file odls_default_module.c at line 1188
[node008:06691] [0,0,1] ORTE_ERROR_LOG: Not found in file odls_default_module.c at line 1188
[node007:04352] [0,0,2] ORTE_ERROR_LOG: Not found in file odls_default_module.c at line 1188
--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: node007
Executable: /var/spool/torque/aux//350.wc01
Cannot continue.
--------------------------------------------------------------------------
[no--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: node004
Executable: /var/spool/torque/aux//350.wc01
Cannot continue.
--------------------------------------------------------------------------
de004:04364] [0,0,3] ORTE_ERROR_LOG: Not found in file odls_default_module.c at line 1188
--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: node003
Executable: /var/spool/torque/aux//350.wc01
Cannot continue.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
Failed to find or execute the following executable:
Host: node008
Executable: /var/spool/torque/aux//350.wc01
Cannot continue.
--------------------------------------------------------------------------
[node007:04352] [0,0,2] ORTE_ERROR_LOG: Not found in file orted.c at line 588
[node008:06691] [0,0,1] ORTE_ERROR_LOG: Not found in file orted.c at line 588
[node004:04364] [0,0,3] ORTE_ERROR_LOG: Not found in file orted.c at line 588
[node003:04409] [0,0,4] ORTE_ERROR_LOG: Not found in file orted.c at line 588

Has anyone seen this before? It seems odd that openmpi would be trying to execute what is effectively the host file. I stuck a sleep in to make sure the file was being distributed, and sure enough, it was there. I am able to run mvapich through torque without issue and openmpi from the command line.

Cheers,

Barry Evans
Technical Manager
OCF plc
+44 (0)7970 148 121
bev...@ocf.co.uk
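
One possible reading of the error, offered as a sketch rather than a confirmed diagnosis: `wc -l FILE` prints the file name after the line count, so `$ncpus` would expand to two words and `mpirun` would take the node file path as the program to launch. That would match the `Executable:` path shown in the log. The snippet below uses a temporary stand-in for `$PBS_NODEFILE` (an assumption, since no real Torque job is involved, and the node names are simply borrowed from the log) to show the difference when the file is read from standard input instead.

```shell
# Stand-in for $PBS_NODEFILE: 4 nodes x 4 cpus = 16 lines (hypothetical contents).
nodefile=$(mktemp)
for node in node003 node004 node007 node008; do
    for cpu in 1 2 3 4; do
        echo "$node"
    done
done > "$nodefile"

# With a file argument, wc prints the count followed by the file name.
with_name=$(wc -l "$nodefile")

# Reading from stdin leaves only the count; tr strips any padding spaces.
ncpus=$(wc -l < "$nodefile" | tr -d ' ')

echo "wc -l FILE   -> $with_name"
echo "wc -l < FILE -> $ncpus"

rm -f "$nodefile"
```

If this is the cause, changing the script line to read the node file via `<` should leave `$ncpus` holding only the number.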