Re: [OMPI users] new overcommitment warning?

2014-09-06 Thread Allin Cottrell

On Fri, 5 Sep 2014, Ralph Castain wrote:


On Sep 5, 2014, at 3:34 PM, Allin Cottrell  wrote:

I suspect there is a new (to openmpi 1.8.N?) warning with respect to 
requesting a number of MPI processes greater than the number of "real" 
cores on a given machine. I can provide a good deal more information if 
that's required, but can I just pose it as a question for now? Does 
anyone know of a relevant change in the code?


The reason I'm asking is that I've been experimenting, on a couple of 
machines and with more than one computational problem, to see if I'm 
better off restricting the number of MPI processes to the number of 
"real" or "physical" cores available, or if it's better to allow a 
larger number of processes up to the number of hyperthreads available 
(which is twice the number of cores on the machines I'm working on).


If you are going to treat hyperthreads as independent processors, then 
you should probably set the --use-hwthreads-as-cpus flag so OMPI knows 
to treat it that way


Hmm, where would I set that? (For reference) mpiexec --version gives

mpiexec (OpenRTE) 1.8.2

and if I append --use-hwthreads-as-cpus to my mpiexec command I get

mpiexec: Error: unknown option "--use-hwthreads-as-cpus"

However, via trial and error I've found that these options work: either

--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)
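
For concreteness, the invocations that suppress the warning here look like 
this (the program name and its argument are just placeholders):

mpiexec --map-by hwthread -np 4 myprog myarg
mpiexec --oversubscribe -np 4 myprog myarg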

What's puzzling me, though, is that the use of these flags was not 
necessary when, earlier this year, I was running ompi 1.6.5. Neither is it 
necessary when running ompi 1.7.3 on a different machine. The warning 
that's printed without these flags seems to be new.


It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and 
non-user friendly) warning whenever I specify to mpiexec a number of 
processes > the number of "real" cores [...]


Could you pass along the warning? It should only give you a warning if 
the #procs > #slots as you are then oversubscribed. You can turn that 
warning off by just adding the oversubscribe flag to your mapping directive


Here's what I'm seeing:


A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to: CORE
   Node:waverley
   #processes:  2
   #cpus:   1

You can override this protection by adding the "overload-allowed"
option to your binding directive.


The machine in question has two cores and four threads. The thing that's 
confusing here is that I'm not aware of supplying any "binding directive": 
my command line (for running on a single host) is just this:


mpiexec -np   

In fact it seems that current ompi "does the right thing" in respect of 
the division of labor even without the extra flags: depending on the 
nature of computation, I can get faster times with -np 4 than with -np 2 
(and no degradation). It just insists on printing this warning which I'd 
like to be able to turn off "globally" if possible.


Allin Cottrell


Re: [OMPI users] compilation problem with ifort

2014-09-06 Thread Elio Physics

Hello Gus,
Thanks once again for your help and guidance. I just want to let you know that 
I was able to resolve the problem. I actually changed the locations of the 
libraries before configuring; for example, I set:
BLAS_LIBS   = -lmkl_intel_lp64 -lmkl_sequential -lmkl_core
LAPACK_LIBS = /opt/intel/Compiler/11.1/069/mkl/lib/em64t/libmkl_lapack95_lp64.a
FFT_LIBS    = -L/opt/scilibs/FFTW/fftw-3.2.1_Bull.9005/lib -lfftw3
and also, to link the espresso libs with the plugins, I added the following line:
TOPDIR=/home_cluster/fis718/./espresso-4.0.3/
The compilation went through with no errors and epw.x was produced. Once again, thanks 
for your support and help.
Regards
ELIO MOUJAES



> Date: Thu, 4 Sep 2014 12:48:44 -0400
> From: g...@ldeo.columbia.edu
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] compilation problem with ifort
> 
> Hi Elie
> 
> The executable generated on my computer will be useless to you,
> because these days most if not all libraries linked to an executable are
> dynamic/shared libraries.
> You won't have the same ones on your computer, or the equivalent will be
> located in different places, maybe from different versions, etc.
> (E.g. your Intel compiler libraries will be from a different version,
> in a different location, and likewise for the OpenMPI libraries etc.)
> Take any executable that you may have on your computer and do "ldd 
> executable_name" to see the list of shared libraries.
> 
> The error you reported suggests a misconfiguration of the Makefiles,
> or rather, a mispositioning of directories.
> 
> **
> 
> First thing I would try is to start fresh.
> Delete or move the old directory trees,
> download everything again on blank directories,
> and do the recipe all over again.
> Leftovers of previous compilations are often a hurdle,
> so you do yourself a favor by starting over from scratch.
> 
> **
> Second *really important* item to check:
> 
> The top directories of QE and EPW *must* follow this hierarchy:
> 
> espresso-4.0.3
> |-- EPW-3.0.0
> 
> Is this what you have?
> The EPW web site only hints at this in step 3 of their recipe.
> The Makefiles will NOT work if this directory hierarchy is incorrect.
> 
> The error you reported in your first email *suggests* that the Makefiles
> in the EPW tarball are not finding the Makefiles in the QE tarball,
> which indicates that the directories may not have a correct relative 
> location.
> 
> I.e. the EPW top directory must be right under the QE top directory.
> 
> **
> 
> The third thing is that you have to follow the recipe strictly (and on
> the EPW web site there seem to be typos and omissions):
> 
> 1) untar the QE tarball:
> 
> tar -zxf espresso-4.0.3.tar.gz
> 
> 2) move the EPW tarball to the QE top directory produced by step 1 
> above, something like this:
> 
> mv EPW-3.0.0.tar.gz espresso-4.0.3
> 
> 3) untar the EPW tarball you copied/moved in step 2 above,
> something like this:
> 
> cd espresso-4.0.3
> tar -zxf  EPW-3.0.0.tar.gz
> 
> 4) Set up your OpenMPI environment (assuming you are using OpenMPI
> and that it is not installed in a standard location such as /usr/local):
> 
> 
> [bash/sh]
> export PATH=/your/openmpi/bin:$PATH
> export LD_LIBRARY_PATH=/your/openmpi/lib:$LD_LIBRARY_PATH
> 
> [tcsh/csh]
> setenv PATH /your/openmpi/bin:$PATH
> setenv LD_LIBRARY_PATH /your/openmpi/lib:$LD_LIBRARY_PATH
> 
> 5) Configure espresso-4.0.3, i.e., assuming you are already in the
> espresso-4.0.3 directory, do:
> 
> ./configure CC=icc F77=ifort
> 
> (assuming you are using the Intel compilers and that you compiled OpenMPI 
> with them; if not, e.g. if you used gcc and gfortran, use 
> CC=gcc FC=gfortran instead)
> 
> 6) Run "make" on the top EPW directory:
> 
> cd EPW-3.0.0
> make
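> 
> Putting steps 1 through 6 together, here is a rough end-to-end sketch 
> (tarball names as above; the OpenMPI path and compiler choices are 
> assumptions you should adapt to your own system):
> 
> tar -zxf espresso-4.0.3.tar.gz
> mv EPW-3.0.0.tar.gz espresso-4.0.3
> cd espresso-4.0.3
> tar -zxf EPW-3.0.0.tar.gz
> export PATH=/your/openmpi/bin:$PATH
> export LD_LIBRARY_PATH=/your/openmpi/lib:$LD_LIBRARY_PATH
> ./configure CC=icc F77=ifort
> cd EPW-3.0.0
> make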
> 
> When you configure QE it doesn't compile anything.
> It just generates/sets up a bunch of Makefiles in the QE directory tree.
> 
> When you do "make" on the EPW-3.0.0 directory the top Makefile just
> says (cd src; make).
> If you look into the "src" subdirectory you will see that the Makefile
> therein points to library and include directories two levels above,
> which means that they are in the *QE* directory tree:
> 
> *
> IFLAGS   = -I../../include
> MODFLAGS = -I./ -I../../Modules -I../../iotk/src \
> -I../../PW -I../../PH -I../../PP
> LIBOBJS  = ../../flib/ptools.a ../../flib/flib.a \
> ../../clib/clib.a ../../iotk/src/libiotk.a
> W90LIB   = ../../W90/libwannier.a
> **
> 
> Hence, if your QE directory is not immediately above your EPW directory
> everything will fail, because the EPW Makefile won't be able to find
> the bits and parts of QE that it needs.
> And this is *exactly what the error message in your first email showed*,
> a bunch of object files that were not found.
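> 
> A quick way to sanity-check the layout before running make (directory 
> names as in the recipe above) is:
> 
> cd espresso-4.0.3/EPW-3.0.0/src
> ls ../../include ../../Modules ../../flib
> 
> If ls complains that any of those paths do not exist, the hierarchy is 
> wrong and make will fail as above.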
> 
> ***
> 
> Sorry, but I cannot do any better than this.
> I hope this helps,
> Gus Correa
> 
> On 09/03/2014 08:59 PM, Elio Physics wrote:
> > Ray and Gus,
> >
> > Thanks a lot for your help. I followed Gus' steps. I still have the same
> > problem

Re: [OMPI users] new overcommitment warning?

2014-09-06 Thread Ralph Castain

On Sep 6, 2014, at 7:52 AM, Allin Cottrell  wrote:

> On Fri, 5 Sep 2014, Ralph Castain wrote:
> 
>> On Sep 5, 2014, at 3:34 PM, Allin Cottrell  wrote:
>> 
>>> I suspect there is a new (to openmpi 1.8.N?) warning with respect to 
>>> requesting a number of MPI processes greater than the number of "real" 
>>> cores on a given machine. I can provide a good deal more information if 
>>> that's required, but can I just pose it as a question for now? Does anyone 
>>> know of a relevant change in the code?
>>> 
>>> The reason I'm asking is that I've been experimenting, on a couple of 
>>> machines and with more than one computational problem, to see if I'm better 
>>> off restricting the number of MPI processes to the number of "real" or 
>>> "physical" cores available, or if it's better to allow a larger number of 
>>> processes up to the number of hyperthreads available (which is twice the 
>>> number of cores on the machines I'm working on).
>> 
>> If you are going to treat hyperthreads as independent processors, then you 
>> should probably set the --use-hwthreads-as-cpus flag so OMPI knows to treat 
>> it that way
> 
> Hmm, where would I set that? (For reference) mpiexec --version gives
> 
> mpiexec (OpenRTE) 1.8.2
> 
> and if I append --use-hwthreads-as-cpus to my mpiexec command I get
> 
> mpiexec: Error: unknown option "--use-hwthreads-as-cpus"
> 
> However, via trial and error I've found that these options work: either
> 
> --map-by hwthread OR
> --oversubscribe (not mentioned in the mpiexec man page)

My apologies - the correct spelling is  --use-hwthread-cpus
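
So, for example (the program name is just a placeholder):

mpiexec --use-hwthread-cpus -np 4 myprog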

> 
> What's puzzling me, though, is that the use of these flags was not necessary 
> when, earlier this year, I was running ompi 1.6.5. Neither is it necessary 
> when running ompi 1.7.3 on a different machine. The warning that's printed 
> without these flags seems to be new.

The binding code changed during the course of the 1.7 series to provide more 
fine-grained options

> 
>>> It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and 
>>> non-user friendly) warning whenever I specify to mpiexec a number of 
>>> processes > the number of "real" cores [...]
>> 
>> Could you pass along the warning? It should only give you a warning if the 
>> #procs > #slots as you are then oversubscribed. You can turn that warning 
> off by just adding the oversubscribe flag to your mapping directive
> 
> Here's what I'm seeing:
> 
> 
> A request was made to bind to that would result in binding more
> processes than cpus on a resource:
> 
>   Bind to: CORE
>   Node:waverley
>   #processes:  2
>   #cpus:   1
> 
> You can override this protection by adding the "overload-allowed"
> option to your binding directive.
> 
> 
> The machine in question has two cores and four threads. The thing that's 
> confusing here is that I'm not aware of supplying any "binding directive": my 
> command line (for running on a single host) is just this:
> 
> mpiexec -np   
> 
> In fact it seems that current ompi "does the right thing" in respect of the 
> division of labor even without the extra flags: depending on the nature of 
> computation, I can get faster times with -np 4 than with -np 2 (and no 
> degradation). It just insists on printing this warning which I'd like to be 
> able to turn off "globally" if possible.

You shouldn't be getting that warning if you aren't specifying a binding 
option, so it looks like a bug to me. I'll check and see what's going on. You 
might want to check, however, that you don't have a binding directive hidden in 
your environment or default MCA param file.

Meantime, just use the oversubscribe or overload-allowed options to turn it 
off. You can put those in the default MCA param file if you don't want to add 
it to the environment or cmd line. The MCA params would be:

OMPI_MCA_rmaps_base_oversubscribe=1

If you want to bind the procs to cores, but allow two procs to share the core 
(each will be bound to both hyperthreads):
OMPI_MCA_hwloc_base_binding_policy=core:overload

If you want to bind the procs to the hyperthreads (since one proc will be bound 
to a hyperthread, no overloading will occur):
OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1
OMPI_MCA_hwloc_base_binding_policy=hwthread
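
If you go the param-file route, the same settings could look roughly like 
this (note that MCA param names drop the OMPI_MCA_ prefix inside the file; 
the per-user path below is the usual default, adjust to your install):

# $HOME/.openmpi/mca-params.conf (or your system-wide openmpi-mca-params.conf)
rmaps_base_oversubscribe = 1
# or, to bind to cores while allowing two procs per core:
# hwloc_base_binding_policy = core:overload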

HTH
Ralph

> 
> Allin Cottrell
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/09/25288.php



Re: [OMPI users] new overcommitment warning?

2014-09-06 Thread Allin Cottrell

On Sat, 6 Sep 2014, Ralph Castain wrote:


On Sep 6, 2014, at 7:52 AM, Allin Cottrell  wrote:


On Fri, 5 Sep 2014, Ralph Castain wrote:


On Sep 5, 2014, at 3:34 PM, Allin Cottrell  wrote:

I suspect there is a new (to openmpi 1.8.N?) warning with respect to 
requesting a number of MPI processes greater than the number of 
"real" cores on a given machine. [...]


If you are going to treat hyperthreads as independent processors, then 
you should probably set the --use-hwthreads-as-cpus flag so OMPI knows 
to treat it that way


Hmm, where would I set that? (For reference) mpiexec --version gives

mpiexec (OpenRTE) 1.8.2

and if I append --use-hwthreads-as-cpus to my mpiexec command I get

mpiexec: Error: unknown option "--use-hwthreads-as-cpus"

However, via trial and error I've found that these options work: either

--map-by hwthread OR
--oversubscribe (not mentioned in the mpiexec man page)


My apologies - the correct spelling is  --use-hwthread-cpus


OK, thanks.

What's puzzling me, though, is that the use of these flags was not 
necessary when, earlier this year, I was running ompi 1.6.5. Neither is 
it necessary when running ompi 1.7.3 on a different machine. The 
warning that's printed without these flags seems to be new.


The binding code changed during the course of the 1.7 series to provide 
more fine-grained options


Again, thanks for the info.

It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure 
and non-user friendly) warning whenever I specify to mpiexec a number 
of processes > the number of "real" cores [...]


Could you pass along the warning? It should only give you a warning if 
the #procs > #slots as you are then oversubscribed. You can turn that 
warning off by just adding the oversubscribe flag to your mapping 
directive


Here's what I'm seeing:


A request was made to bind to that would result in binding more
processes than cpus on a resource:

  Bind to: CORE
  Node:waverley
  #processes:  2
  #cpus:   1

You can override this protection by adding the "overload-allowed"
option to your binding directive.


The machine in question has two cores and four threads. The thing 
that's confusing here is that I'm not aware of supplying any "binding 
directive": my command line (for running on a single host) is just 
this:


mpiexec -np   

[...]


You shouldn't be getting that warning if you aren't specifying a binding 
option, so it looks like a bug to me. I'll check and see what's going 
on. You might want to check, however, that you don't have a binding 
directive hidden in your environment or default MCA param file.


I don't think that's the case: the only mca-params.conf file on my system 
is the default /etc/openmpi/openmpi-mca-params.conf installed by Arch, 
which is empty apart from comments, and "set | grep MCA" doesn't produce 
anything.


Meantime, just use the oversubscribe or overload-allowed options to turn 
it off. You can put those in the default MCA param file if you don't 
want to add it to the environment or cmd line. The MCA params would be:


OMPI_MCA_rmaps_base_oversubscribe=1

If you want to bind the procs to cores, but allow two procs to share the 
core (each will be bound to both hyperthreads): 
OMPI_MCA_hwloc_base_binding_policy=core:overload


If you want to bind the procs to the hyperthreads (since one proc will 
be bound to a hyperthread, no overloading will occur): 
OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 
OMPI_MCA_hwloc_base_binding_policy=hwthread


Thanks, that's all very useful. One more question: how far back in ompi 
versions do the relevant mpiexec flags go?


I ask because the (econometrics) program I work on has a facility for 
semi-automating use of MPI, which includes formulating a suitable mpiexec 
call on behalf of the user, and I'm wondering if --oversubscribe and/or 
--use-hwthread-cpus will "just work", or might choke earlier versions of

mpiexec.

Allin Cottrell


Re: [OMPI users] new overcommitment warning?

2014-09-06 Thread Ralph Castain

On Sep 6, 2014, at 11:00 AM, Allin Cottrell  wrote:

> On Sat, 6 Sep 2014, Ralph Castain wrote:
> 
>> On Sep 6, 2014, at 7:52 AM, Allin Cottrell  wrote:
>> 
>>> On Fri, 5 Sep 2014, Ralph Castain wrote:
>>> 
>>>> On Sep 5, 2014, at 3:34 PM, Allin Cottrell  wrote:
>>>> 
>>>>> I suspect there is a new (to openmpi 1.8.N?) warning with respect to 
>>>>> requesting a number of MPI processes greater than the number of "real" 
>>>>> cores on a given machine. [...]
>>>> 
>>>> If you are going to treat hyperthreads as independent processors, then you 
>>>> should probably set the --use-hwthreads-as-cpus flag so OMPI knows to 
>>>> treat it that way
>>> 
>>> Hmm, where would I set that? (For reference) mpiexec --version gives
>>> 
>>> mpiexec (OpenRTE) 1.8.2
>>> 
>>> and if I append --use-hwthreads-as-cpus to my mpiexec command I get
>>> 
>>> mpiexec: Error: unknown option "--use-hwthreads-as-cpus"
>>> 
>>> However, via trial and error I've found that these options work: either
>>> 
>>> --map-by hwthread OR
>>> --oversubscribe (not mentioned in the mpiexec man page)
>> 
>> My apologies - the correct spelling is  --use-hwthread-cpus
> 
> OK, thanks.
> 
>>> What's puzzling me, though, is that the use of these flags was not 
>>> necessary when, earlier this year, I was running ompi 1.6.5. Neither is it 
>>> necessary when running ompi 1.7.3 on a different machine. The warning 
>>> that's printed without these flags seems to be new.
>> 
>> The binding code changed during the course of the 1.7 series to provide more 
>> fine-grained options
> 
> Again, thanks for the info.
> 
>>>>> It seems to me that openmpi >= 1.8 is giving me a (somewhat obscure and 
>>>>> non-user friendly) warning whenever I specify to mpiexec a number of 
>>>>> processes > the number of "real" cores [...]
>>>> 
>>>> Could you pass along the warning? It should only give you a warning if the 
>>>> #procs > #slots as you are then oversubscribed. You can turn that warning 
>>>> off by just adding the oversubscribe flag to your mapping directive
>>> 
>>> Here's what I'm seeing:
>>> 
>>> 
>>> A request was made to bind to that would result in binding more
>>> processes than cpus on a resource:
>>> 
>>>  Bind to: CORE
>>>  Node:waverley
>>>  #processes:  2
>>>  #cpus:   1
>>> 
>>> You can override this protection by adding the "overload-allowed"
>>> option to your binding directive.
>>> 
>>> 
>>> The machine in question has two cores and four threads. The thing that's 
>>> confusing here is that I'm not aware of supplying any "binding directive": 
>>> my command line (for running on a single host) is just this:
>>> 
>>> mpiexec -np   
>>> 
>>> [...]
>> 
>> You shouldn't be getting that warning if you aren't specifying a binding 
>> option, so it looks like a bug to me. I'll check and see what's going on. 
>> You might want to check, however, that you don't have a binding directive 
>> hidden in your environment or default MCA param file.
> 
> I don't think that's the case: the only mca-params.conf file on my system is 
> the default /etc/openmpi/openmpi-mca-params.conf installed by Arch, which is 
> empty apart from comments, and "set | grep MCA" doesn't produce anything.

Okay - I've replicated the bug here, so I'll address it for 1.8.3. Thanks for 
letting me know about it!

> 
>> Meantime, just use the oversubscribe or overload-allowed options to turn it 
>> off. You can put those in the default MCA param file if you don't want to 
>> add it to the environment or cmd line. The MCA params would be:
>> 
>> OMPI_MCA_rmaps_base_oversubscribe=1
>> 
>> If you want to bind the procs to cores, but allow two procs to share the 
>> core (each will be bound to both hyperthreads): 
>> OMPI_MCA_hwloc_base_binding_policy=core:overload
>> 
>> If you want to bind the procs to the hyperthreads (since one proc will be 
>> bound to a hyperthread, no overloading will occur): 
>> OMPI_MCA_hwloc_base_use_hwthreads_as_cpus=1 
>> OMPI_MCA_hwloc_base_binding_policy=hwthread
> 
> Thanks, that's all very useful. One more question: how far back in ompi 
> versions do the relevant mpiexec flags go?
> 
> I ask because the (econometrics) program I work on has a facility for 
> semi-automating use of MPI, which includes formulating a suitable mpiexec 
> call on behalf of the user, and I'm wondering if --oversubscribe and/or 
> --use-hwthread-cpus will "just work", or might choke earlier versions of
> mpiexec.

At least 1.7.4 for the hwthread-cpus - maybe a little further back than that, 
but definitely not into the 1.6 series.

The -oversubscribe flag goes all the way back to the very first release.
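
So your wrapper could simply key off the reported version when it builds 
the command line; a rough, untested sketch (the program name is just a 
placeholder):

ver=$(mpiexec --version 2>&1 | grep -o '[0-9][0-9.]*' | head -n1)
case "$ver" in
  1.[0-6].*|1.7.[0-3]) extra="--oversubscribe" ;;
  *)                   extra="--oversubscribe --use-hwthread-cpus" ;;
esac
mpiexec $extra -np 4 myprog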

> 
> Allin Cottrell
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/09/25291.php