I'm using gridengine 6.2u5 and openmpi 1.3.3. I'm submitting a parallel job and would like to specify a rankfile to set processor binding but am getting errors.
The $PE_HOSTFILE generated by gridengine is:

amos.cora.nwra.com 4 cloud...@amos.cora.nwra.com UNDEFINED
andrew.cora.nwra.com 4 cloud...@andrew.cora.nwra.com UNDEFINED

The rankfile I'm using is:

rank 0=amos.cora.nwra.com slot=0
rank 1=andrew.cora.nwra.com slot=0
rank 2=amos.cora.nwra.com slot=4
rank 3=andrew.cora.nwra.com slot=4
rank 4=amos.cora.nwra.com slot=1
rank 5=andrew.cora.nwra.com slot=1
rank 6=amos.cora.nwra.com slot=5
rank 7=andrew.cora.nwra.com slot=5

The error I'm getting is:

Rankfile claimed host amos.cora.nwra.com that was not allocated or oversubscribed it's slots:

--------------------------------------------------------------------------
[amos:05727] [[44126,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 108
[amos:05727] [[44126,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
[amos:05727] [[44126,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
[amos:05727] [[44126,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 990
--------------------------------------------------------------------------
A daemon (pid unknown) died unexpectedly on signal 1 while attempting to
launch so we are aborting.

Any ideas? Thanks!

- Orion

--
Orion Poplawski
Technical Manager                     303-415-9701 x222
NWRA/CoRA Division                    FAX: 303-415-9702
3380 Mitchell Lane                  or...@cora.nwra.com
Boulder, CO 80301              http://www.cora.nwra.com
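
For reference, the two conditions named in the error message (host not allocated, or more ranks than slots) can be cross-checked by hand with a small script. This is only a rough sketch: it assumes each $PE_HOSTFILE line is "host slots queue processor-range" as shown above and each rankfile line is "rank N=host slot=S". The function names are made up for illustration, and it does not reproduce whatever logic Open MPI's rank_file mapper actually applies.

```python
from collections import Counter


def parse_pe_hostfile(text):
    """Return {hostname: slot_count} from $PE_HOSTFILE contents.

    Assumes whitespace-separated fields with the host first and the
    slot count second, as in the hostfile quoted in this post.
    """
    slots = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            slots[fields[0]] = int(fields[1])
    return slots


def parse_rankfile(text):
    """Return a list of (rank, host, slot_spec) tuples.

    Assumes lines of the form 'rank 0=host.example.com slot=0'.
    Blank lines and lines starting with '#' are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        left, slot_spec = line.rsplit(" slot=", 1)
        rank_part, host = left.split("=", 1)
        rank = int(rank_part.split()[1])
        entries.append((rank, host.strip(), slot_spec.strip()))
    return entries


def check(hostfile_text, rankfile_text):
    """Return a list of problem strings; an empty list means none found.

    Host names are compared literally, so a short name in one file and
    a fully qualified name in the other is reported as not allocated.
    """
    allocated = parse_pe_hostfile(hostfile_text)
    used = Counter(host for _, host, _ in parse_rankfile(rankfile_text))
    problems = []
    for host, count in used.items():
        if host not in allocated:
            problems.append(f"{host}: not in allocation")
        elif count > allocated[host]:
            problems.append(
                f"{host}: {count} ranks but only {allocated[host]} slots"
            )
    return problems
```

Calling check() with the contents of the two files above returns an empty list: each host has 4 slots and the rankfile places 4 ranks on each, with matching host names. So a simple count-and-name comparison does not by itself account for the error.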