On Sat, 6 Sep 2014, Ralph Castain wrote:
On Sep 6, 2014, at 7:52 AM, Allin Cottrell <cottr...@wfu.edu> wrote:
On Fri, 5 Sep 2014, Ralph Castain wrote:
On Sep 5, 2014, at 3:34 PM, Allin Cottrell <cottr...@wfu.edu> wrote:
I suspect there is a new (to openmpi 1.8.N?) warning with respect to
requesting a number of MPI processes greater than the number of
"real" cores on a given machine. [...]
If you are going to treat hyperthreads as independent processors, then
you should probably set the --use-hwthreads-as-cpus flag so OMPI knows
to treat them that way.
Hmm, where would I set that? (For reference) mpiexec --version gives
mpiexec (OpenRTE) 1.8.2
and if I append --use-hwthreads-as-cpus to my mpiexec command I get
mpiexec: Error: unknown option "--use-hwthreads-as-cpus"
However, via trial and error I've found that these options work: either
--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)
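For concreteness, the full invocations that avoid the warning on this machine look like the following (a sketch assuming a 4-process run of a hypothetical ./myprogram; the program name and count are illustrative, not from the thread):

```shell
# Map ranks onto hyperthreads, so all four show up as cpus:
mpiexec --map-by hwthread -np 4 ./myprogram myprogram-data

# Or simply allow more processes than detected slots:
mpiexec --oversubscribe -np 4 ./myprogram myprogram-data
```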
My apologies - the correct spelling is --use-hwthread-cpus
OK, thanks.
What's puzzling me, though, is that the use of these flags was not
necessary when, earlier this year, I was running ompi 1.6.5. Neither is
it necessary when running ompi 1.7.3 on a different machine. The
warning that's printed without these flags seems to be new.
The binding code changed during the course of the 1.7 series to provide
more finely-controlled options.
Again, thanks for the info.
It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure
and non-user friendly) warning whenever I specify to mpiexec a number
of processes > the number of "real" cores [...]
Could you pass along the warning? It should only give you a warning if
the #procs > #slots, as you are then oversubscribed. You can turn that
warning off by just adding the oversubscribe flag to your mapping
directive.
Here's what I'm seeing:
<quote>
A request was made to bind to that would result in binding more
processes than cpus on a resource:
Bind to: CORE
Node: waverley
#processes: 2
#cpus: 1
You can override this protection by adding the "overload-allowed"
option to your binding directive.
</quote>
The machine in question has two cores and four threads. The thing
that's confusing here is that I'm not aware of supplying any "binding
directive": my command line (for running on a single host) is just
this:
mpiexec -np <N> <myprogram> <myprogram-data>
[...]
You shouldn't be getting that warning if you aren't specifying a binding
option, so it looks like a bug to me. I'll check and see what's going
on. You might want to check, however, that you don't have a binding
directive hidden in your environment or default MCA param file.
I don't think that's the case: the only mca-params.conf file on my system
is the default /etc/openmpi/openmpi-mca-params.conf installed by Arch,
which is empty apart from comments, and "set | grep MCA" doesn't produce
anything.
Meantime, just use the oversubscribe or overload-allowed options to turn
it off. You can put those in the default MCA param file if you don't
want to add them to the environment or cmd line. The MCA params would be:
OMPI_MCA_rmaps_base_oversubscribe=1
If you want to bind the procs to cores, but allow two procs to share the
core (each will be bound to both hyperthreads):
OMPI_MCA_hwloc_base_binding_policy=core:overload
If you want to bind the procs to the hyperthreads (since one proc will
be bound to a hyperthread, no overloading will occur):
OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
OMPI_MCA_hwloc_base_binding_policy=hwthread
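As a sketch, here is how the options above could be set as environment variables; the same names, minus the OMPI_MCA_ prefix, can instead go one-per-line into the system param file (/etc/openmpi/openmpi-mca-params.conf on the Arch install mentioned above) or a per-user $HOME/.openmpi/mca-params.conf. The two binding-policy alternatives conflict with each other, so pick at most one:

```shell
# Allow oversubscription outright (turns the warning off):
export OMPI_MCA_rmaps_base_oversubscribe=1

# Alternatively, bind to cores but allow two procs to share a core:
#   export OMPI_MCA_hwloc_base_binding_policy=core:overload

# Or treat hyperthreads as cpus and bind one proc per hyperthread:
#   export OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
#   export OMPI_MCA_hwloc_base_binding_policy=hwthread
```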
Thanks, that's all very useful. One more question: how far back in ompi
versions do the relevant mpiexec flags go?
I ask because the (econometrics) program I work on has a facility for
semi-automating use of MPI, which includes formulating a suitable mpiexec
call on behalf of the user, and I'm wondering if --oversubscribe and/or
--use-hwthread-cpus will "just work", or might choke earlier versions of
mpiexec.
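One way to guard a generated mpiexec call against older versions is to parse the reported version and only add the newer flags past a cutoff. A sketch, assuming (not confirmed in this thread) that 1.8 is the right threshold; the function name and ./myprogram are hypothetical:

```shell
# Return success if an Open MPI version string (e.g. "1.8.2")
# is at least 1.8; the 1.8 cutoff is an assumption.
ompi_at_least_1_8() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 8 ]; }
}

# Pull the first dotted number out of "mpiexec --version" output:
ver=$(mpiexec --version 2>/dev/null | grep -o '[0-9][0-9.]*' | head -n1)

extra=""
if [ -n "$ver" ] && ompi_at_least_1_8 "$ver"; then
    extra="--oversubscribe"
fi
# mpiexec $extra -np 4 ./myprogram
```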
Allin Cottrell