[OMPI users] Update/patch to check/opal_check_pmi.m4

2014-10-06 Thread Timothy Brown
Hi,

I’m not too sure if this is the right list, or if I should be posting to the 
dev list. Please let me know if I’m in the wrong.

We use SLURM (14.03.7) and have been trying to get OpenMPI (1.8.3) to work with 
`srun`. It seems that the M4 file that checks for PMI doesn’t work in our 
situation, where we have both a lib64 and a lib directory within the SLURM 
installation. The lib64 directory only contains Perl modules, while the lib 
directory contains the PMI library.

After changing the M4 AS_IF macro in config/opal_check_pmi.m4 to check for the 
library .so file and adding an else-if test, the configure script finds the 
library. OpenMPI then builds with PMI support, and we now have 
- srun
- mpirun
- mpiexec
all working properly.

I have created a patch against the git master and it’s attached.

Regards
Timothy




0001-Updating-the-PMI-check.patch


Re: [OMPI users] Update/patch to check/opal_check_pmi.m4

2014-10-06 Thread Timothy Brown
Yes, I know. Sorry I might not have articulated myself fully earlier.

Currently if I run configure as:

$ ./configure --prefix=/curc/tools/x_86_64/rh6/openmpi/1.8.3/intel/13.0.0 \
  --with-threads=posix --enable-mpi-thread-multiple \
  --with-pmi=/curc/slurm/slurm/current/ --with-slurm

I get the following error:

--- MCA component common:pmi (m4 configuration macro)
checking for MCA component common:pmi compile mode... dso
checking if user requested PMI support... yes
checking if PMI or PMI2 headers installed... Slurm PMI headers found
checking for PMI2_Init in -lpmi2... no
checking for PMI2_Init in -lpmi... no
checking for PMI_Init in -lpmi... no
checking PMI2 and/or PMI support enabled... no
configure: WARNING: PMI support requested (via --with-pmi) but not found.
configure: error: Aborting.

As the test in config/opal_check_pmi.m4 contains:

[AS_IF([test -d "$with_pmi/lib64"],
[opal_check_pmi_$1_LDFLAGS="-L$with_pmi/lib64"
 opal_pmi_rpath="$with_pmi/lib64"],
[opal_check_pmi_$1_LDFLAGS="-L$with_pmi/lib"
 opal_pmi_rpath="$with_pmi/lib"])

And in our SLURM installation directory:

$ ls /curc/slurm/slurm/current/lib64/
perl5
$ ls /curc/slurm/slurm/current/lib/
libpmi.a   libpmi.so.0      libslurmdb.a   libslurmdb.so.27      libslurm.so         slurm
libpmi.la  libpmi.so.0.0.0  libslurmdb.la  libslurmdb.so.27.0.0  libslurm.so.27
libpmi.so  libslurm.a       libslurmdb.so  libslurm.la           libslurm.so.27.0.0

So the patch I am providing checks for the actual libpmi.so file, by
1) replacing the test -d with a test -f 
2) appending the file we are looking for (libpmi.so)
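
The two changes listed above could look roughly like the following shell sketch. It is not the attached patch, only an illustration of the idea. The function name `find_pmi_libdir` is hypothetical, and the real macro sets `opal_check_pmi_$1_LDFLAGS` and `opal_pmi_rpath` rather than printing a path.

```shell
#!/bin/sh
# Hypothetical helper: print the first library directory under a PMI
# prefix that actually contains libpmi.so, preferring lib64 over lib.
# A directory that exists but lacks the library (our lib64) is skipped.
find_pmi_libdir() {
    prefix=$1
    if test -f "$prefix/lib64/libpmi.so"; then
        echo "$prefix/lib64"
    elif test -f "$prefix/lib/libpmi.so"; then
        echo "$prefix/lib"
    else
        # Neither directory holds the library.
        return 1
    fi
}
```

With the directory layout shown above, `find_pmi_libdir /curc/slurm/slurm/current` would select the lib directory, because lib64 exists but holds no libpmi.so.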

You do bring up an interesting point that I hadn’t thought of. If it also needs 
to check for libpmi2.so, that can be handled by adding two more tests, with 
their run-if-true actions, to the AS_IF macro. If you deem my patch worthwhile, 
I am happy to modify it to meet this criterion.
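
As a rough illustration of covering both libraries, the sketch below loops over libpmi2 and libpmi and over lib64 and lib. It is an assumption about how such a check could be structured, not the code that went into Open MPI. The function name `find_pmi_libs` and its output format are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for each PMI library name, report the first
# directory under the prefix that holds its shared object.
# Prints one "<name> <dir>" line per library found; fails if none is found.
find_pmi_libs() {
    prefix=$1
    found=1
    for name in pmi2 pmi; do
        for dir in lib64 lib; do
            if test -f "$prefix/$dir/lib$name.so"; then
                echo "$name $prefix/$dir"
                found=0
                break   # stop at the first directory for this library
            fi
        done
    done
    return $found
}
```

Either library, or both, may be present, so the caller would have to handle one or two output lines.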

Regards
Timothy


On Oct 6, 2014, at 6:07 PM, Joshua Ladd  wrote:

> We only link in libpmi(2).so if specifically requested to do so via 
> "--with-pmi" configure flag. It is not automatic. 
> 
> Josh
> 
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/10/25467.php
> 



Re: [OMPI users] Update/patch to check/opal_check_pmi.m4

2014-10-07 Thread Timothy Brown
Ralph,

Thanks for the patch. It cleaned up the pmi check nicely.

Applied, configured and compiled without any problems! Great!

The configure gave me:
--- MCA component pubsub:pmi (m4 configuration macro)
checking for MCA component pubsub:pmi compile mode... dso
checking if user requested PMI support... yes
checking if PMI installed... yes
checking final added libraries... -lpmi
checking if MCA component pubsub:pmi can compile... yes

Regards
Timothy


On Oct 7, 2014, at 9:39 AM, Ralph Castain  wrote:

> I've poked at this a bit and think I have all the combinations covered - can 
> you try the attached patch? I don't have a way to test it right now, so I 
> don't want to put it in the trunk.
> 
> Thanks
> Ralph
> 
> 
> On Mon, Oct 6, 2014 at 6:02 PM, Ralph Castain  wrote:
> I've looked at your patch, and it isn't quite right as it only looks for 
> libpmi and not libpmi2. We need to look for each of them as we could have 
> either or both.
> 
> I'll poke a bit at this tonight and see if I can make this a little simpler - 
> the nesting is getting a little deep.
> 
> 

Re: [OMPI users] Noob installation problem

2014-11-24 Thread Timothy Brown

> On Nov 24, 2014, at 3:45 AM, Wildes Andrew  wrote:
> 
> Hello,
> 
> I've been trying to install OpenMPI (v. 1.8.3) on my mac (OS 10.6.8).  I have 
> gcc in my path (v. 4.6.0).  The ./configure routine finds it, but says that 
> it doesn't work.
> 
> Looking through config.log (attached), I see that it's trying to access 
> 'conftest.c'.  This file isn't found (it doesn't seem to be in the openmpi 
> compressed file, nor is it anywhere to be found elsewhere on my mac), and I 
> suspect that it's at this point that the compilation attempt terminates.
> 
> I'm sorry to bother you with what is probably a trivial problem.  Any help 
> would be greatly appreciated.


In looking through your log, there are a couple of things.

Firstly, it complains that gcc cannot create executables. Line 16:

configure: error: C compiler cannot create executables

Secondly, line 114:

gcc: error trying to exec 'as': execvp: No such file or directory

I'm thinking the first error is from stderr/stdout, while your config.log 
starts at line 25. Yes?

So on OS X, have you installed Xcode and all its dependencies? Are you using 
MacPorts, Homebrew or Fink? Or any other package manager?
Are you able to compile anything? A simple (non-MPI) hello world or some such?

The program `as` is the assembler (it translates assembly code to object code). 
gcc invokes it during compilation, so if it is missing gcc cannot create 
executables.
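
A quick way to see whether the tools gcc depends on are reachable is sketched below. The helper name `have_tool` is hypothetical. gcc can also look in its own program directories, so treat a "not found" here as a first hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report the tools needed to turn C source into an executable:
# the compiler driver, the assembler and the linker.
for tool in gcc as ld; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```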

Hope this helps.

Timothy

Re: [OMPI users] Openmpi compilation errors

2015-05-29 Thread Timothy Brown

> On May 29, 2015, at 5:07 AM, Jeff Squyres (jsquyres)  
> wrote:
> 
> On May 29, 2015, at 6:54 AM, Bruno Queiros  wrote:
> 
>> The name of the binary is correct: pgf90. The name of the file is also 
>> correct: .pgf90.rc. I do have some doubts about the content of the file. Is 
>> this enough?
>> 
>> switch -pthread is replace(-lpthread) positional(linker)
> 
> I'm not a Portland customer -- I don't know.  You'll need to check their 
> documentation.
> 

Here I have a siterc file within the PGI bin directory, for example:

/some/long/path/pgi/15.3/bin/siterc

I have exactly the same line as you have specified. 

If you are unable to put it in the PGI installation bin directory you can put 
it in a file ${HOME}/.mypgf90rc, as is described in section 1.8.2 (page 14) of 
the PGI Compilers Users Guide ( http://www.pgroup.com/doc/pgiug.pdf ).
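
As a sketch of the per-user alternative, the following appends the switch line quoted in this thread to an rc file if it is not already there. The helper name `add_pthread_switch` is hypothetical, and the file name should be checked against the PGI guide for your compiler version.

```shell
#!/bin/sh
# Hypothetical helper: append the -pthread switch line quoted in this thread
# to a PGI rc file, unless the file already contains exactly that line.
add_pthread_switch() {
    rcfile=$1
    line='switch -pthread is replace(-lpthread) positional(linker)'
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}
```

It would be called as `add_pthread_switch "${HOME}/.mypgf90rc"`. Running it twice leaves a single copy of the line in the file.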
 

>> If I source .pgf90.rc, I get errors:
>> 
>> -bash: ./.pgf90.rc: line 1: syntax error near unexpected token `('
>> -bash: ./.pgf90.rc: line 1: `switch -pthread is replace(-lpthread) 
>> positional(linker)'
> 
> I'm guessing that this file is not intended to be sourced by the shell, but 
> rather noticed and read/used by the pgf90 compiler when it is invoked.
> 

Jeff, you're right. It's not for your shell to source; it's for the compiler to 
read.


> Sidenote: isn't there a pgfortran compiler executable that is supposed to be 
> preferred over "pgf90" these days?  (remember my disclaimer: I'm not a 
> Portland customer, so I could be totally off base here...)  Have you tried 
> pgfortran to see if it accepts the -pthread option?  Sometimes the different 
> compiler executable entry points behave slightly differently...
> 

I've built Openmpi 1.8.5 with the following configure line:

./configure  \
  --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/pgi/15.3 \
  --with-threads=posix \
  --enable-mpi-thread-multiple \
  --with-slurm \
  --with-pmi=/curc/slurm/slurm/current/

Please note, I am using the following environment variables:
CC=pgcc
FC=pgfortran
F90=pgf90
F77=pgf77
CXX=pgc++

I do not use pgprepro for CPP as I found it to be flaky.

Hope this helps.
Timothy

Re: [OMPI users] Openmpi compilation errors

2015-05-30 Thread Timothy Brown

> On May 30, 2015, at 4:34 AM, Jeff Squyres (jsquyres)  
> wrote:
> 
> On May 29, 2015, at 11:19 AM, Timothy Brown  
> wrote:
>> 
>> I've built Openmpi 1.8.5 with the following configure line:
>> 
>> ./configure  \
>> --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/pgi/15.3 \
>> --with-threads=posix \
>> --enable-mpi-thread-multiple \
>> --with-slurm \
>> --with-pmi=/curc/slurm/slurm/current/
>> 
>> Please note, I am using the following environment variables:
>> CC=pgcc
>> FC=pgfortran
>> F90=pgf90
>> F77=pgf77
>> CXX=pgc++
> 
> Sweet -- thanks for the info, Tim.
> 
> One extremely minor tweak that I would recommend is to do this, instead:
> 
> ./configure  \
> CC=pgcc \
> FC=pgfortran \
> F90=pgf90 \
> F77=pgf77 \
> CXX=pgc++ \
> --prefix=/curc/tools/x86_64/rh6/software/openmpi/1.8.5/pgi/15.3 \
> --with-threads=posix \
> --enable-mpi-thread-multiple \
> --with-slurm \
> --with-pmi=/curc/slurm/slurm/current/
> 
> I.e., set those environment variables on the configure command line instead 
> of having them in your environment.
> 
> The end effect is exactly the same -- the only difference is that these 
> environment variables will be explicitly listed right at the top in the 
> config.log file that is generated when you run configure.  It's a very minor 
> thing -- just for helping your future self when remembering exactly how your 
> copy of Open MPI was built.


Hi Jeff,

Yes, I have often gone back to a build and tried to figure out how it was 
actually built. We're investigating using EasyBuild ( 
https://github.com/hpcugent/easybuild ); however, we haven't taken the plunge yet.

We do use Lmod ( https://github.com/TACC/Lmod ) and so I have modules loaded 
into our environment.

All this being said, I wrote a quick and dirty script that logs the modules 
loaded and the configure line from the config.log to a DB. The script also 
queries the DB and can produce the configure command (including module loads) 
and update the prefix based upon the source directory you're in. It only does 
this for patch version updates (i.e. 1.8.4 -> 1.8.5); I decided that for 
anything larger than this you really should read the docs/configure options 
to see if you're doing things correctly. It's an ugly script that I'm not 
proud of, but it works!
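
This is not my script, only a minimal sketch of the first step: pulling the configure invocation out of a config.log. It relies on Autoconf recording the command near the top of the file on a line that starts with two spaces, a dollar sign and a space. The function name `configure_invocation` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the configure invocation recorded in a config.log.
# Autoconf writes it near the top of the file as a line starting with "  $ ".
configure_invocation() {
    # Strip the "  $ " prefix and keep only the first matching line.
    sed -n 's/^  \$ //p' "$1" | head -n 1
}
```

Calling `configure_invocation config.log` in a build directory prints the command line, including any CC=... style settings that were passed to configure.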

Regards
Tim




Re: [OMPI users] difference between OPENMPI e Intel MPI (DATATYPE)

2015-09-03 Thread Timothy Brown
Hi Diego,

I think the Intel HPC forum comment is about using environment modules to 
manage your environment (PATH, LD_LIBRARY_PATH variables).

Most HPC systems use environment modules:
- Tcl ( http://modules.cvs.sourceforge.net/viewvc/modules/modules/tcl/ )
- C/Tcl ( http://sourceforge.net/project/showfiles.php?group_id=15538 )
- Lmod  ( https://www.tacc.utexas.edu/research-development/tacc-projects/lmod )

If your system has environment modules, you'd typically:
- load a compiler (Intel, GCC, PGI, etc.).
- load an MPI built with that compiler (Intel MPI, OpenMPI).

The most important thing here is to have a software stack that is consistent, 
that is, built with the same compiler. For example, use GCC with OpenMPI to 
both build and execute your program. Do not build with GCC and OpenMPI and then 
execute with Intel and Intel MPI.
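
One rough sanity check, once your modules are loaded, is to confirm that the MPI compiler wrapper and the MPI launcher come from the same installation directory. The sketch below assumes both are on PATH. The helper name `same_prefix` is hypothetical, and it only compares install locations, not which compiler the MPI was built with.

```shell
#!/bin/sh
# Hypothetical helper: succeed if two commands resolve to the same directory
# on PATH, e.g. the MPI compiler wrapper and the MPI launcher.
# Returns 2 if either command cannot be found at all.
same_prefix() {
    a=$(command -v "$1") || return 2
    b=$(command -v "$2") || return 2
    test "$(dirname "$a")" = "$(dirname "$b")"
}
```

For an OpenMPI environment you might run `same_prefix mpicc mpirun && echo consistent`. A failure suggests the build and run environments are mixed.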

Regards


> On Sep 3, 2015, at 8:43 AM, Diego Avesani  wrote:
> 
> Dear Jeff, Dear all,
> I normally use "USE MPI".
> 
> This is the answer from the Intel HPC forum:
> 
> If you are switching between intel and openmpi you must remember not to mix 
> environment.  You might use modules to manage this.  As the data types 
> encodings differ, you must take care that all objects are built against the 
> same headers.
> 
> Could someone explain to me what these modules are and how I can use them?
> 
> Thanks
> 
> Diego
> 
> Diego
> 
> 
> On 2 September 2015 at 19:07, Jeff Squyres (jsquyres)  
> wrote:
> Can you reproduce the error in a small example?
> 
> Also, try using "use mpi" instead of "include 'mpif.h'", and see if that 
> turns up any errors.
> 
> 
> > On Sep 2, 2015, at 12:13 PM, Diego Avesani  wrote:
> >
> > Dear Gilles, Dear all,
> > I have found the error. Some CPUs had no elements to share. It was my mistake.
> >
> > Now I have another one:
> >
> > Fatal error in MPI_Isend: Invalid communicator, error stack:
> > MPI_Isend(158): MPI_Isend(buf=0x137b7b4, count=1, INVALID DATATYPE, dest=0, 
> > tag=0, comm=0x0, request=0x7fffe8726fc0) failed
> >
> > In this case it does not work with Intel MPI, but it works with OpenMPI.
> >
> > Can you see anything in particular in the error message?
> >
> > Diego
> >
> >
> > Diego
> >
> >
> > On 2 September 2015 at 14:52, Gilles Gouaillardet 
> >  wrote:
> > Diego,
> >
> > about MPI_Allreduce, you should use MPI_IN_PLACE if you want the same 
> > buffer in send and recv
> >
> > about the stack, I notice comm is NULL which is a bit surprising...
> > at first glance, type creation looks good.
> > that being said, you do not check MPIdata%iErr is MPI_SUCCESS after each 
> > MPI call.
> > I recommend you first do this, so you can catch the error as soon it 
> > happens, and hopefully understand why it occurs.
> >
> > Cheers,
> >
> > Gilles
> >
> >
> > On Wednesday, September 2, 2015, Diego Avesani  
> > wrote:
> > Dear all,
> >
> > I have noticed small differences between OpenMPI and Intel MPI.
> > For example, in MPI_ALLREDUCE Intel MPI does not allow the same variable to 
> > be used as both the send and the receive buffer.
> >
> > I have written my code with OpenMPI, but unfortunately I have to run it on 
> > an Intel MPI cluster.
> > Now I have the following error:
> >
> > Fatal error in MPI_Isend: Invalid communicator, error stack:
> > MPI_Isend(158): MPI_Isend(buf=0x1dd27b0, count=1, INVALID DATATYPE, dest=0, 
> > tag=0, comm=0x0, request=0x7fff9d7dd9f0) failed
> >
> >
> > This is how I create my type:
> >
> >   CALL  MPI_TYPE_VECTOR(1, Ncoeff_MLS, Ncoeff_MLS, MPI_DOUBLE_PRECISION, coltype, MPIdata%iErr)
> >   CALL  MPI_TYPE_COMMIT(coltype, MPIdata%iErr)
> >   !
> >   CALL  MPI_TYPE_VECTOR(1, nVar, nVar, coltype, MPI_WENO_TYPE, MPIdata%iErr)
> >   CALL  MPI_TYPE_COMMIT(MPI_WENO_TYPE, MPIdata%iErr)
> >
> >
> > Do you believe the problem is here?
> > Is this also how Intel MPI creates a datatype?
> >
> > Maybe I should also ask Intel MPI users.
> > What do you think?
> >
> > Diego
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> > Link to this post: 
> > http://www.open-mpi.org/community/lists/users/2015/09/27523.php
> >
> 
> 
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to: 
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 