Hi Ralph,

I found another corner-case hang in openmpi-1.7.5rc3.

Condition:
1. Allocate some nodes using a resource manager (RM) such as TORQUE.
2. When executing the job, request only the head node with the -host or -hostfile option.

Example:
1. Allocate node05 and node06 using TORQUE.
2. Request only node05 with the -host option.

[mishima@manage ~]$ qsub -I -l nodes=node05+node06
qsub: waiting for job 8661.manage.cluster to start
qsub: job 8661.manage.cluster ready

[mishima@node05 ~]$ cat $PBS_NODEFILE
node05
node06
[mishima@node05 ~]$ mpirun -np 1 -host node05 ~/mis/openmpi/demos/myprog
<< hang here >>

My fix for plm_base_launch_support.c is as follows:

--- plm_base_launch_support.c 2014-03-12 05:51:45.000000000 +0900
+++ plm_base_launch_support.try.c 2014-03-18 08:38:03.000000000 +0900
@@ -1662,7 +1662,11 @@
         OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
                              "%s plm:base:setup_vm only HNP left",
                              ORTE_NAME_PRINT(ORTE_PROC_MY_NAME)));
+        /* cleanup */
         OBJ_DESTRUCT(&nodes);
+        /* mark that the daemons have reported so we can proceed */
+        daemons->state = ORTE_JOB_STATE_DAEMONS_REPORTED;
+        daemons->updated = false;
         return ORTE_SUCCESS;
     }

Tetsuya