Re: [OMPI users] LAMA error - mpirun segfault

Nils Smeds Mon, 10 Aug 2015 06:33:21 -0400 (EDT)

Thanks Ralph,

I'm trying to find out what binding can be accomplished from the
command line and when I need to generate a mapping file. I find that
using the command line is typically more robust. It is just too easy
to forget to adapt a mapping script when moving between systems.


For the sake of argument, one use case would be to map a hybrid
MPI+threads application onto a set of dedicated nodes. That is, to have
MPI processes evenly distributed over hosts, sockets and NUMA domains,
but ranked compactly (MPI tasks that are close in rank in
MPI_COMM_WORLD are more likely to be in the same NUMA domain).
Processes should be bound to multiple cores in the NUMA domain, but
should not overlap cores with other MPI processes unless
over-subscribed. The threading library would then rebind threads to
cores within this subset.
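
To make the intended layout concrete, here is a minimal Python sketch
of the placement described above. It is only an illustration of the
goal, not anything Open MPI provides: the function name plan_layout,
its parameters, and the assumption that cores are numbered
contiguously per NUMA domain within each host are all hypothetical. It
also assumes the rank count divides evenly over the NUMA domains and
does not handle the over-subscribed case.

```python
def plan_layout(n_hosts, numa_per_host, cores_per_numa, n_ranks):
    """Return a list of (rank, host, numa, cores) tuples.

    Hypothetical helper: ranks are spread evenly over all NUMA domains,
    consecutive ranks share a domain (compact ranking), and each rank
    gets its own non-overlapping block of cores inside that domain.
    Core ids are host-local and assumed contiguous per NUMA domain.
    """
    n_domains = n_hosts * numa_per_host
    if n_ranks % n_domains != 0:
        raise ValueError("ranks do not divide evenly over NUMA domains")

    ranks_per_domain = n_ranks // n_domains
    cores_per_rank = cores_per_numa // ranks_per_domain
    if cores_per_rank == 0:
        # More ranks than cores in a domain: not covered by this sketch.
        raise ValueError("over-subscribed: fewer cores than ranks per domain")

    layout = []
    for rank in range(n_ranks):
        # Consecutive ranks fill one NUMA domain before moving to the next.
        domain, slot = divmod(rank, ranks_per_domain)
        host, numa = divmod(domain, numa_per_host)
        first_core = numa * cores_per_numa + slot * cores_per_rank
        cores = list(range(first_core, first_core + cores_per_rank))
        layout.append((rank, host, numa, cores))
    return layout


if __name__ == "__main__":
    # Example: 2 hosts, 2 NUMA domains per host, 8 cores per domain, 8 ranks.
    for rank, host, numa, cores in plan_layout(2, 2, 8, 8):
        print(f"rank {rank}: host {host}, NUMA {numa}, cores {cores}")
```

With these example numbers each rank would own four cores, and a
threading library could then pin its threads within that block.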

I just happened to choose LAMA because I found some fairly recent
presentations about it online and it looked very adaptable. If there
are other schemes that are in better shape, I'd be happy to use
whatever is recommended.

With the general command-line options -map-by and -bind-to, I have not
been able to bind the MPI processes to a subset of the cores in a NUMA
region, only to the whole NUMA node or to a single core. I have not yet
found a good presentation of the more detailed --mca rmaps* options, so
I am not sure what I can do with them or how.

Pointers to what can be done with the parts of Open MPI that are
actively maintained are most welcome.

Cheers
