Lloyd Brown <lloyd_br...@byu.edu> writes:

> No problem.  It wasn't much of a delay.
>
> The scenario involves a combination of MPI and OpenMP (or other
> threading scheme).  Basically, the software will launch one or more
> processes via MPI, which then spawn threads to do the work.
>
> What we've been seeing is that, without something like '--bind-to none'
> or similar, those threads end up being pinned to the same processor as
> the process that spawned them.

The default binding is supposed to be to sockets, as --report-bindings
should show.  Otherwise see another message I just posted for an
empirical test (and possibly examples in the referenced tutorials -- I
don't remember).

> We're okay with a bind=none, since we already have cgroups in place to
> constrain the user to the resources they request.  We might get more
> process/thread migration between processors (but within the cgroup) than
> we would like, but that's still probably acceptable in this scenario.
>
> If there's a better solution, we'd love to hear it.

--cpus-per-proc, or whatever the non-deprecated version is in mpirun(1).
[You needed --loadbalance in OMPI 1.6 to make that work.]
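For illustration, something along these lines -- the --map-by syntax
with a PE= modifier is, I believe, the non-deprecated replacement in
OMPI 1.8+; the rank/core counts and program name here are just
placeholders for your job:

```shell
# Hypothetical example: 4 MPI ranks, each bound to 6 cores, so each
# rank's OpenMP threads have room to spread within its binding.
# --map-by ...:PE=n is the replacement for the deprecated --cpus-per-proc;
# --report-bindings prints the resulting binding for each rank.
mpirun -np 4 --map-by slot:PE=6 --report-bindings ./my_threaded_app
```

Check the --report-bindings output to confirm each rank really gets a
6-core binding before relying on it.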

You might also like to supply environment variables to get the OpenMP
runtime to do the right thing for thread affinity, if it doesn't
already; there isn't an OMPI mechanism for that, but you can do it with
a wrapper script or a simple LD_PRELOAD library.
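A minimal wrapper sketch, assuming an OpenMP 4.0 runtime that honors
the standard OMP_PROC_BIND/OMP_PLACES controls (the script name and the
particular settings are just placeholders):

```shell
#!/bin/sh
# omp-wrapper.sh (hypothetical name): export standard OpenMP affinity
# hints, then exec the real program passed as arguments, so the hints
# reach every rank's runtime.
export OMP_PROC_BIND=spread   # spread threads across the available places
export OMP_PLACES=cores       # one place per physical core
exec "$@"
```

You would then launch with something like
"mpirun -np 4 ./omp-wrapper.sh ./my_app", so each rank inherits the
same affinity hints without touching the application itself.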
