What version of OMPI are you using? We have a "seq" mapper that does what you 
want, but the precise cmd line option for directing OMPI to use it depends a 
bit on the version.
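As a rough sketch (the exact spelling of the option varies by release, so treat the flags below as assumptions to check against your version's mpirun man page): the seq mapper assigns ranks to hosts in the order the hosts appear in the hostfile, so a hostfile listing A B B C A C, one per line, yields the ABBCAC placement without any core-level binding.

```shell
# hostfile.seq -- one host per line, one line per rank, in the desired order:
#   A
#   B
#   B
#   C
#   A
#   C

# Open MPI 1.7.x and later (assumed syntax):
mpirun --hostfile hostfile.seq --map-by seq -np 6 ./my_app

# Open MPI 1.6.x and earlier (assumed syntax, via the rmaps MCA parameter):
mpirun --hostfile hostfile.seq --mca rmaps seq -np 6 ./my_app
```

Since this maps at the node level only, it sidesteps the rankfile/slot question entirely.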


On Apr 9, 2014, at 9:22 AM, Gan, Qi PW <qi.g...@pw.utc.com> wrote:

> Hi,
>  
> I have a problem placing the processes of a parallel job in a specified 
> order. Suppose a job with 6 processes (rank0 to rank5) needs to run on 3 
> hosts (A, B, C) in the following order:
>         Rank0  -- A
>         Rank1  -- B
>         Rank2  -- B
>         Rank3  -- C
>         Rank4  -- A
>         Rank5  -- C
> Specifying this order (ABBCAC) in the hostfile doesn’t work because Open MPI 
> only supports “byslot” (AABBCC) or “bynode” (ABCABC) ranking orders.
>  
> However, if I use a rankfile to implement this order in the format of
>         rank 0=A slot=<slot setting>
>         rank 1=B slot=<slot setting>
>         rank 2=B slot=<slot setting>
>         rank 3=C slot=<slot setting>
>         rank 4=A slot=<slot setting>
>         rank 5=C slot=<slot setting>
> I run into another problem: how to determine the <slot setting> for each 
> rank. If I bind each rank to all cores/CPUs on a node (e.g. rank 0=A 
> slot=0-n, where n is the maximal CPU number), I get the following 
> errors:
>  
> *** An error occurred in MPI_comm_size
> *** on a NULL communicator
> *** Unknown error
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
> forrtl: severe (174): SIGSEGV, segmentation fault occurred
>  
> If I don’t select all cores, I need to identify which cores are available to 
> my job in order to avoid CPU oversubscription, since the nodes are shared by 
> multiple jobs.
>  
> Our system is an Intel-based cluster (12 or 16 cores per node), and jobs are 
> submitted through the LSF batch system.
>  
> Here is my question: how can I enforce a specified order of processes at the 
> node level without binding at the core/CPU level?
>  
> Any help and suggestions would be appreciated.
>  
> Thanks,
> Chee
>  
>  
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users