In your desired ordering you have rank 0 on (socket,core) (0,0) and rank 1 on (0,2). Is there an architectural reason for that? That is, are cores 0 and 1 hardware threads of the same physical core, or is there a cache level (say L2 or L3) shared by cores 0 and 1 but separate from cores 2 and 3?
hwloc's lstopo should give you that information if you don't have that information handy. I am asking so that I might provide you with a potentially more general solution than a rankfile.

-- Josh

On Wed, Nov 7, 2012 at 12:25 PM, Blosch, Edwin L <edwin.l.blo...@lmco.com> wrote:
> I am trying to map MPI processes to sockets in a somewhat compacted pattern
> and I am wondering the best way to do it.
>
> Say there are 2 sockets (0 and 1) and each processor has 4 cores (0,1,2,3)
> and I have 4 MPI processes, each of which will use 2 OpenMP processes.
>
> I’ve re-ordered my parallel work such that pairs of ranks (0,1 and 2,3)
> communicate more with each other than with other ranks. Thus I think the
> best mapping would be:
>
> RANK   SOCKET   CORE
> 0      0        0
> 1      0        2
> 2      1        0
> 3      1        2
>
> My understanding is that --bysocket --bind-to-socket will give me ranks 0
> and 2 on socket 0 and ranks 1 and 3 on socket 1, not what I want.
>
> It looks like --cpus-per-proc might be what I want, i.e. seems like I might
> give the value 2. But it was unclear to me whether I would also need to
> give --bysocket and the FAQ suggests this combination is untested.
>
> May be a rankfile is what I need?
>
> I would appreciate some advice on the easiest way to get this mapping.
>
> Thanks

--
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey
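
For comparison with the rankfile option raised in the quoted message, here is a minimal sketch of what such a rankfile might contain for the RANK/SOCKET/CORE table above. It assumes a single node, a hypothetical hostname `node01`, and that each rank should be bound to two adjacent cores (one per OpenMP thread). The helper `rankfile_lines` is illustrative only and is not part of Open MPI; the `slot=socket:core-range` form follows the rankfile syntax described in the mpirun man page, which should be checked against the installed version.

```python
# Hypothetical helper (not part of Open MPI): build rankfile lines that
# place consecutive ranks on the same socket, each bound to a block of cores.
def rankfile_lines(host, sockets=2, cores_per_socket=4, cores_per_rank=2):
    lines = []
    rank = 0
    for socket in range(sockets):
        # Walk the socket's cores in blocks of cores_per_rank: 0-1, 2-3, ...
        for first in range(0, cores_per_socket, cores_per_rank):
            last = first + cores_per_rank - 1
            cores = str(first) if last == first else f"{first}-{last}"
            # Rankfile entry form: rank <N>=<host> slot=<socket>:<cores>
            lines.append(f"rank {rank}={host} slot={socket}:{cores}")
            rank += 1
    return lines


if __name__ == "__main__":
    # "node01" is an assumed hostname; substitute the real one.
    print("\n".join(rankfile_lines("node01")))
```

The printed lines could be saved to a file (say `myrankfile`, an assumed name) and passed to mpirun with `-rf myrankfile`. Whether binding each rank to a core pair is the right choice depends on the cache and hardware-thread layout that lstopo reports.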