On Sat, 6 Sep 2014, Ralph Castain wrote:

On Sep 6, 2014, at 7:52 AM, Allin Cottrell <cottr...@wfu.edu> wrote:

On Fri, 5 Sep 2014, Ralph Castain wrote:

On Sep 5, 2014, at 3:34 PM, Allin Cottrell <cottr...@wfu.edu> wrote:

I suspect there is a new (to openmpi 1.8.N?) warning with respect to requesting a number of MPI processes greater than the number of "real" cores on a given machine. [...]

If you are going to treat hyperthreads as independent processors, then you should probably set the --use-hwthreads-as-cpus flag so OMPI knows to treat them that way.

Hmm, where would I set that? (For reference) mpiexec --version gives

mpiexec (OpenRTE) 1.8.2

and if I append --use-hwthreads-as-cpus to my mpiexec command I get

mpiexec: Error: unknown option "--use-hwthreads-as-cpus"

However, via trial and error I've found that these options work: either

--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)

My apologies - the correct spelling is --use-hwthread-cpus

OK, thanks.

What's puzzling me, though, is that the use of these flags was not necessary when, earlier this year, I was running ompi 1.6.5. Neither is it necessary when running ompi 1.7.3 on a different machine. The warning that's printed without these flags seems to be new.

The binding code changed during the course of the 1.7 series to provide finer-grained control over binding options.

Again, thanks for the info.

It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and non-user friendly) warning whenever I specify to mpiexec a number of processes > the number of "real" cores [...]

Could you pass along the warning? It should only give you a warning if #procs > #slots, since you are then oversubscribed. You can turn that warning off by just adding the oversubscribe flag to your mapping directive.

Here's what I'm seeing:

<quote>
A request was made to bind to that would result in binding more
processes than cpus on a resource:

  Bind to:     CORE
  Node:        waverley
  #processes:  2
  #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
</quote>

The machine in question has two cores and four threads. The thing that's confusing here is that I'm not aware of supplying any "binding directive": my command line (for running on a single host) is just this:

mpiexec -np <N> <myprogram> <myprogram-data>

[...]

You shouldn't be getting that warning if you aren't specifying a binding option, so it looks like a bug to me. I'll check and see what's going on. You might want to check, however, that you don't have a binding directive hidden in your environment or default MCA param file.

I don't think that's the case: the only mca-params.conf file on my system is the default /etc/openmpi/openmpi-mca-params.conf installed by Arch, which is empty apart from comments, and "set | grep MCA" doesn't produce anything.

Meantime, just use the oversubscribe or overload-allowed options to turn it off. You can put those in the default MCA param file if you don't want to add them to the environment or cmd line. The MCA params would be:

OMPI_MCA_rmaps_base_oversubscribe=1

If you want to bind the procs to cores, but allow two procs to share the core (each will be bound to both hyperthreads):

OMPI_MCA_hwloc_base_binding_policy=core:overload

If you want to bind the procs to the hyperthreads (since one proc will be bound to each hyperthread, no overloading will occur):

OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
OMPI_MCA_hwloc_base_binding_policy=hwthread
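
As a rough illustration of the three settings listed above, the following shell sketch wraps each one in a small helper that exports it into the environment before mpiexec is launched. The helper function names are hypothetical; the variable names and values are copied from the message. In an MCA parameter file, as opposed to the environment, parameters are normally written without the OMPI_MCA_ prefix (for example, rmaps_base_oversubscribe = 1).

```shell
#!/bin/sh
# Sketch only: variable names and values come from the message above;
# the function names are hypothetical helpers, not part of Open MPI.

# Allow more processes than slots (silences the oversubscription warning).
ompi_allow_oversubscribe() {
    export OMPI_MCA_rmaps_base_oversubscribe=1
}

# Bind processes to cores, letting two processes share one core.
ompi_bind_core_overload() {
    export OMPI_MCA_hwloc_base_binding_policy=core:overload
}

# Treat hyperthreads as cpus and bind one process per hyperthread.
ompi_bind_hwthread() {
    export OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
    export OMPI_MCA_hwloc_base_binding_policy=hwthread
}
```

One of the helpers would be called before launching, for example ompi_bind_hwthread followed by the usual mpiexec -np <N> <myprogram> command.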

Thanks, that's all very useful. One more question: how far back in ompi versions do the relevant mpiexec flags go?

I ask because the (econometrics) program I work on has a facility for semi-automating use of MPI, which includes formulating a suitable mpiexec call on behalf of the user, and I'm wondering if --oversubscribe and/or --use-hwthread-cpus will "just work", or might choke earlier versions of mpiexec.

Allin Cottrell
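
One way a wrapper program might guard against older mpiexec versions is to read the version banner first and only add the flag when the version is recent enough. The sketch below assumes the banner format shown earlier in the thread ("mpiexec (OpenRTE) 1.8.2"). The function names are hypothetical, and MIN_VERSION is a placeholder assumption rather than a documented threshold: the thread does not establish which release first accepted --oversubscribe.

```shell
#!/bin/sh
# Sketch only: MIN_VERSION is an assumed cutoff, not a documented one.
MIN_VERSION=${MIN_VERSION:-1.8}

# Extract e.g. "1.8.2" from a banner such as "mpiexec (OpenRTE) 1.8.2".
ompi_version_from_banner() {
    printf '%s\n' "$1" |
        sed -n 's/^mpiexec (OpenRTE) \([0-9][0-9.]*\).*/\1/p' |
        head -n 1
}

# Succeed (exit status 0) if dotted version $1 >= dotted version $2.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = (i <= na) ? x[i] + 0 : 0
            yi = (i <= nb) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# Print the extra flag only when the banner reports a new enough version.
choose_flags() {
    ver=$(ompi_version_from_banner "$1")
    if [ -n "$ver" ] && version_ge "$ver" "$MIN_VERSION"; then
        printf '%s\n' "--oversubscribe"
    fi
}
```

A caller could then write flags=$(choose_flags "$(mpiexec --version 2>&1)") and pass $flags to mpiexec. An unrecognized banner yields no extra flag, so the command falls back to the plain form.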
