In your desired ordering you have rank 0 on (socket,core) (0,0) and
rank 1 on (0,2). Is there an architectural reason for that? In other
words, are cores 0 and 1 hardware threads in the same core, or is
there a cache level (say L2 or L3) that cores 0 and 1 share
separately from cores 2 and 3?

hwloc's lstopo should give you that information if you don't have it
handy.

I am asking so that I might provide you with a potentially more
general solution than a rankfile.
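
For reference, the rankfile route for the mapping in the quoted message
can be sketched as below. This is only an illustration under stated
assumptions: the hostname "node0" is hypothetical, the helper name
make_rankfile is made up for this example, and each rank is given two
consecutive cores (0-1 and 2-3 on each socket) on the assumption that
its two OpenMP threads should sit next to each other. The entries use
the "rank N=host slot=socket:cores" form of Open MPI rankfiles.

```python
def make_rankfile(host, sockets, cores_per_socket, cores_per_rank):
    """Return rankfile text that fills each socket with consecutive ranks.

    host is a placeholder hostname; replace it with the real node name.
    """
    lines = []
    rank = 0
    for socket in range(sockets):
        # Step through the socket in blocks of cores_per_rank cores.
        for first in range(0, cores_per_socket - cores_per_rank + 1,
                           cores_per_rank):
            last = first + cores_per_rank - 1
            cores = str(first) if last == first else f"{first}-{last}"
            lines.append(f"rank {rank}={host} slot={socket}:{cores}")
            rank += 1
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 2 sockets, 4 cores per socket, 2 cores (OpenMP threads) per rank.
    print(make_rankfile("node0", 2, 4, 2), end="")
```

The output can be saved to a file and passed to mpirun through its
rankfile option; check the mpirun man page of your Open MPI version for
the exact spelling.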

-- Josh


On Wed, Nov 7, 2012 at 12:25 PM, Blosch, Edwin L
<edwin.l.blo...@lmco.com> wrote:
> I am trying to map MPI processes to sockets in a somewhat compacted pattern
> and I am wondering what the best way to do it is.
>
>
>
> Say there are 2 sockets (0 and 1) and each socket has 4 cores (0,1,2,3)
> and I have 4 MPI processes, each of which will use 2 OpenMP threads.
>
>
>
> I’ve re-ordered my parallel work such that pairs of ranks (0,1 and 2,3)
> communicate more with each other than with other ranks.  Thus I think the
> best mapping would be:
>
>
>
> RANK   SOCKET   CORE
> 0      0        0
> 1      0        2
> 2      1        0
> 3      1        2
>
>
>
> My understanding is that --bysocket --bind-to-socket will give me ranks 0
> and 2 on socket 0 and ranks 1 and 3 on socket 1, which is not what I want.
>
>
>
> It looks like --cpus-per-proc might be what I want, i.e. with a value of
> 2.  But it was unclear to me whether I would also need to give
> --bysocket, and the FAQ suggests this combination is untested.
>
>
>
> Maybe a rankfile is what I need?
>
>
>
> I would appreciate some advice on the easiest way to get this mapping.
>
>
>
> Thanks
>
>



-- 
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey
