I changed the error message; I hope it is clearer now (r21919).

On Tue, Sep 1, 2009 at 2:13 PM, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:
> Please try using the full hostname (drdb0235.en.desres.deshaw.com)
> in the hostfile/rankfile.
> It should help.
> Lenny.
>
> On Mon, Aug 31, 2009 at 7:43 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> I'm afraid the rank-file mapper in 1.3.3 has several known problems that
>> have been described on the list by users. We hopefully have those fixed in
>> the upcoming 1.3.4 release.
>>
>> On Aug 31, 2009, at 10:01 AM, Sacerdoti, Federico wrote:
>>
>> Hi,
>>
>> I am trying to use the rankmap to bind a 4-proc mpi job to one socket of a
>> two-socket, 8 core machine. However I'm getting a strange error.
>>
>> CMDS USED
>> orterun --hostfile hostlist.1 -n 4 --mca rmaps_rank_file_path ./rankmap.1 desres-netscan -o $OUTDIR
>>
>> $ cat rankmap.1
>> rank 0=drdb0235.en slot=0:0
>> rank 1=drdb0235.en slot=0:1
>> rank 2=drdb0235.en slot=0:2
>> rank 3=drdb0235.en slot=0:3
>>
>> $ cat hostlist.1
>> drdb0235.en slots=8
>>
>> ERROR SEEN
>> --------------------------------------------------------------------------
>> Rankfile claimed host drdb0235.en that was not allocated or oversubscribed it's slots:
>> --------------------------------------------------------------------------
>> [drdb0235.en.desres.deshaw.com:14242] [[37407,0],0] ORTE_ERROR_LOG: Bad parameter in file rmaps_rank_file.c at line 108
>> [drdb0235.en.desres.deshaw.com:14242] [[37407,0],0] ORTE_ERROR_LOG: Bad parameter in file base/rmaps_base_map_job.c at line 87
>> [drdb0235.en.desres.deshaw.com:14242] [[37407,0],0] ORTE_ERROR_LOG: Bad parameter in file base/plm_base_launch_support.c at line 77
>> [drdb0235.en.desres.deshaw.com:14242] [[37407,0],0] ORTE_ERROR_LOG: Bad parameter in file plm_rsh_module.c at line 985
>>
>> From looking at the code in rmaps_rank_file.c it seems the error occurs
>> when the node-gathering code wraps twice around the hostlist. However I don't
>> see why that is happening.
>>
>> If I specify 8 slots in the rankmap, I see a different error:
>> Error, invalid rank (4) in the rankfile (./rankmap.1)
>>
>> Thanks,
>> Federico
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
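
For anyone hitting the same two errors, a small pre-flight check of the hostfile and rankfile may help before launching. The sketch below is not Open MPI code and does not reproduce the logic in rmaps_rank_file.c; the function names (parse_hostfile, parse_rankfile, check_rankfile) are made up for illustration. It assumes the simple "host slots=N" and "rank N=host slot=S:C" line formats shown in the thread, and it assumes that the "invalid rank (4)" error comes from rankfile entries beyond the -n value, which the thread does not confirm. The short-name warning reflects Lenny's advice to use the fully qualified hostname.

```python
import re

# Matches lines such as "rank 0=drdb0235.en slot=0:0" (format taken from the thread).
RANK_LINE = re.compile(r"^rank\s+(\d+)\s*=\s*(\S+)\s+slot\s*=\s*(\S+)$")


def parse_hostfile(text):
    """Return {hostname: slots} from lines like 'drdb0235.en slots=8'."""
    hosts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tokens = line.split()
        slots = 1  # assumed default when no slots= token is given
        for tok in tokens[1:]:
            if tok.startswith("slots="):
                slots = int(tok.split("=", 1)[1])
        hosts[tokens[0]] = slots
    return hosts


def parse_rankfile(text):
    """Return a list of (rank, hostname, slot_spec) tuples."""
    entries = []
    for line in text.splitlines():
        match = RANK_LINE.match(line.strip())
        if match:
            entries.append((int(match.group(1)), match.group(2), match.group(3)))
    return entries


def check_rankfile(hostfile_text, rankfile_text, nprocs, fqdn=None):
    """Return a list of human-readable problems (empty if none were found).

    fqdn is the fully qualified name the node reports for itself, if known.
    """
    hosts = parse_hostfile(hostfile_text)
    problems = []
    for rank, host, _slot in parse_rankfile(rankfile_text):
        if rank >= nprocs:
            problems.append(f"rank {rank} is outside 0..{nprocs - 1}")
        if host not in hosts:
            problems.append(f"rank {rank}: host {host} is not in the hostfile")
        if fqdn is not None and host != fqdn and fqdn.startswith(host + "."):
            problems.append(f"rank {rank}: {host} is a short form of {fqdn}")
    return problems


# Sample data copied from the original report.
HOSTFILE = "drdb0235.en slots=8\n"
RANKFILE = "".join(f"rank {i}=drdb0235.en slot=0:{i}\n" for i in range(4))

if __name__ == "__main__":
    for problem in check_rankfile(HOSTFILE, RANKFILE, 4,
                                  fqdn="drdb0235.en.desres.deshaw.com"):
        print(problem)
```

With the files from the report, the check flags every rankfile entry as using a short form of the node's full name, which is the case Lenny's suggestion addresses. Passing the same hostname spelling in both files and keeping rank numbers below the -n value should make it return an empty list.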