What version of OMPI are you using? We have a "seq" (sequential) mapper that does what you want, but the exact command-line option for selecting it depends somewhat on the version.
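As a rough sketch of how the sequential mapper is typically fed: it reads the hostfile one line at a time and places rank N on the host named on line N, so the hostfile simply lists the hosts in rank order. The Python helper below is hypothetical (the name `seq_hostfile` is not part of Open MPI); it just turns a rank-to-host table, using the placeholder host names A, B, and C from the question, into that file. The launch line in the comment is an assumption to check against `mpirun --help` for your release, since the option spelling varies by version.

```python
# Hypothetical helper, not part of Open MPI: build a hostfile for the
# sequential mapper, one host name per line, in rank order.
def seq_hostfile(rank_to_host):
    ranks = sorted(rank_to_host)
    # The mapper assigns ranks by line position, so gaps would shift ranks.
    if ranks != list(range(len(ranks))):
        raise ValueError("ranks must be contiguous and start at 0")
    return "\n".join(rank_to_host[r] for r in ranks) + "\n"

# Desired placement from the question (ABBCAC); A, B, C are placeholders.
placement = {0: "A", 1: "B", 2: "B", 3: "C", 4: "A", 5: "C"}
hostfile_text = seq_hostfile(placement)
print(hostfile_text, end="")

# Assumed launch line (verify the option name for your Open MPI version):
#   mpirun -mca rmaps seq -hostfile my_hosts ./my_app
```

Because placement is done at node level from the file order, this approach does not require per-rank slot or core settings.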
On Apr 9, 2014, at 9:22 AM, Gan, Qi PW <qi.g...@pw.utc.com> wrote:

> Hi,
>
> I have a problem when setting the processes of a parallel job with specified
> order. Suppose a job with 6 processes (rank0 to rank5) needs to run on 3
> hosts (A, B, C) with following order:
> Rank0 -- A
> Rank1 -- B
> Rank2 -- B
> Rank3 -- C
> Rank4 -- A
> Rank5 -- C
> Specifying this order (ABBCAC) in hostfile doesn’t work because Open MPI
> only supports “byslot” (AABBCC) or “bynode” (ABCABC) ranking orders.
>
> However, if I use rankfile to implement this order in the format of
> rank 0=A slot=<slot setting>
> rank 0=B slot=<slot setting>
> rank 0=B slot=<slot setting>
> rank 0=C slot=<slot setting>
> rank 0=A slot=<slot setting>
> rank 0=C slot=<slot setting>
> I run into another problem on how to determine the <slot setting> for each
> rank. If I bind each rank to all cores/CPUs on a node (e.g. rank 0=A
> slot=0-n, where n is the maximal CPU number), I run into the following
> errors:
>
> *** An error occurred in MPI_comm_size
> *** on a NULL communicator
> *** Unknown error
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>
> If I don’t select all cores, I need to identify which cores are available to
> my job in order to avoid CPU oversubscribing since the nodes are shared by
> multiple jobs.
>
> Our system is the intel based cluster (12 or 16 cores per node) and the job
> is submitted by LSF batch submitter.
>
> Here is my question: how to implement a specified order of processes at node
> level without binding at core/cpu level?
>
> Any help and suggestions would be appreciated.
>
> Thanks,
> Chee