Hi Jim,

I have quite a bit of experience with compiling openmpi for dirac.
Here is what I use to configure openmpi:

./configure --prefix=$instdir \
            --disable-silent-rules \
            --enable-mpirun-prefix-by-default \
            --with-threads=posix \
            --enable-cxx-exceptions \
            --with-tm=$torquedir \
            --with-wrapper-ldflags="-Wl,-rpath,${instdir}/lib" \
            --with-openib \
            --with-hwloc=$hwlocdir \
            CC=gcc \
            CXX=g++ \
            FC="$FC" \
            F77="$FC" \
            CFLAGS="-O3" \
            CXXFLAGS="-O3" \
            FFLAGS="-O3 $I8FLAG" \
            FCFLAGS="-O3 $I8FLAG"

You need to set FC to either ifort or gfortran (those are the two compilers
that I have used) and set I8FLAG to -fdefault-integer-8 for gfortran or
-i8 for ifort.
Set torquedir to the directory where torque is installed ($torquedir/lib
must contain libtorque.so), if you are running jobs under torque; otherwise
remove the --with-tm=... line.
Set hwlocdir to the directory where you have hwloc installed. You may not
need the --with-hwloc=... option because openmpi comes with its own copy of
hwloc (I don't have experience with that because we install hwloc
independently).
Set instdir to the directory where you want to install openmpi.
You may or may not need the --with-openib option depending on whether
you have an Infiniband interconnect.
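
For concreteness, this is roughly how I would set the variables above before
running configure (gfortran shown; the paths are placeholders that you need
to adapt to your site):

instdir=/opt/openmpi-1.6.5-i8     # where openmpi will be installed
torquedir=/opt/torque             # $torquedir/lib must contain libtorque.so
hwlocdir=/opt/hwloc               # only needed if you keep --with-hwloc=...
FC=gfortran                       # or ifort
I8FLAG=-fdefault-integer-8        # use -i8 when FC=ifort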

After configure/make/make install, the so-compiled version can be used
with dirac without changing the dirac source code.
(There is one caveat: you should make sure that all "count" arguments
in MPI calls in dirac are smaller than 2^31 - 1. I have run into a few cases
where that is not the case; this problem can be overcome by replacing
MPI_Allreduce calls in dirac with a wrapper that calls MPI_Allreduce
repeatedly; see the sketch after the setup commands below.)
This is what I use to set up dirac:

export PATH=$instdir/bin:$PATH
./setup --prefix=$diracinstdir \
        --fc=mpif90 \
        --cc=mpicc \
        --int64 \
        --explicit-libs="-lmkl_intel_ilp64 -lmkl_sequential -lmkl_core"

where $instdir is the directory where you installed openmpi from above.
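
Regarding the count caveat above: here is a minimal sketch of the kind of
chunked wrapper I mean (an illustration only, not the exact code we use; it
assumes a double precision buffer reduced in place with MPI_SUM through the
"use mpi" module -- adapt the datatype and operation to the call you are
replacing):

! Splits one large reduction into pieces whose count stays below 2^31 - 1.
! Under -i8/-fdefault-integer-8 every INTEGER here is 8 bytes, matching
! the openmpi build above.
subroutine allreduce_chunked(buf, n, comm, ierr)
  use mpi
  implicit none
  integer, intent(in)             :: n        ! total count; may exceed 2^31 - 1
  double precision, intent(inout) :: buf(n)
  integer, intent(in)             :: comm
  integer, intent(out)            :: ierr
  integer, parameter :: maxcount = 2000000000 ! safely below 2^31 - 1
  integer :: offset, chunk

  offset = 0
  do while (offset < n)
     chunk = min(maxcount, n - offset)
     call MPI_Allreduce(MPI_IN_PLACE, buf(offset+1), chunk, &
                        MPI_DOUBLE_PRECISION, MPI_SUM, comm, ierr)
     if (ierr /= MPI_SUCCESS) return
     offset = offset + chunk
  end do
end subroutine allreduce_chunked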

I would never use the so-compiled openmpi version for anything other
than dirac though. I am not saying that it cannot work (at a minimum
you need to compile Fortran programs with the appropriate I8FLAG),
but it is an unnecessary complication: I have not encountered a piece
of software other than dirac that requires this.

Cheers,
Martin

-- 
Martin Siegert
Head, Research Computing
WestGrid/ComputeCanada Site Lead
Simon Fraser University
Burnaby, British Columbia
Canada

On Wed, Oct 30, 2013 at 06:00:56PM -0500, Jim Parker wrote:
> 
>    Jeff,
>      Here's what I know:
>    1.  Checked FAQs.  Done
>    2.  Version 1.6.5
>    3. config.log file has been removed by the sysadmin...
>    4. ompi_info -a from head node is in attached as headnode.out
>    5. N/A
>    6. compute node info in attached as compute-x-yy.out
>    7. As discussed, local variables are being overwritten after calls to
>    MPI_RECV from Fortran code
>    8. ifconfig output from head node and computes listed as *-ifconfig.out
>    Cheers,
>    --Jim
> 
>    On Wed, Oct 30, 2013 at 5:29 PM, Jeff Squyres (jsquyres)
>    <[1]jsquy...@cisco.com> wrote:
> 
>      Can you send the information listed here:
>          [2]http://www.open-mpi.org/community/help/
> 
>    On Oct 30, 2013, at 6:22 PM, Jim Parker <[3]jimparker96...@gmail.com>
>    wrote:
>    > Jeff and Ralph,
>    >   Ok, I downshifted to a helloWorld example (attached); bottom line:
>    after I hit the MPI_Recv call, my local variable (rank) gets borked.
>    >
>    > I have compiled with -m64 -fdefault-integer-8 and even have assigned
>    kind=8 to the integers (which would be the preferred method in my case)
>    >
>    > Your help is appreciated.
>    >
>    > Cheers,
>    > --Jim
>    >
>    >
>    >
>    > On Wed, Oct 30, 2013 at 4:49 PM, Jeff Squyres (jsquyres)
>    <[4]jsquy...@cisco.com> wrote:
>    > On Oct 30, 2013, at 4:35 PM, Jim Parker <[5]jimparker96...@gmail.com>
>    wrote:
>    >
>    > >   I have recently built a cluster that uses the 64-bit indexing
>    feature of OpenMPI following the directions at
>    > >
>    [6]http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers
>    >
>    > That should be correct (i.e., passing -i8 in FFLAGS and FCFLAGS for
>    OMPI 1.6.x).
>    >
>    > > My question is what are the new prototypes for the MPI calls ?
>    > > specifically
>    > > MPI_RECV
>    > > MPI_Allgatherv
>    >
>    > They're the same as they've always been.
>    >
>    > The magic is that the -i8 flag tells the compiler "make all Fortran
>    INTEGERs be 8 bytes, not (the default) 4."  So Ralph's answer was
>    correct in that all the MPI parameters are INTEGERs -- but you can tell
>    the compiler that all INTEGERs are 8 bytes, not 4, and therefore get
>    "large" integers.
>    >
>    > Note that this means that you need to compile your application with
>    -i8, too.  That will make *your* INTEGERs also be 8 bytes, and then
>    you'll match what Open MPI is doing.
>    >
>    > > I'm curious because some of my local variables get killed (set to
>    null) upon my first call to MPI_RECV.  Typically, this is due (in
>    Fortran) to someone not setting the 'status' variable to an appropriate
>    array size.
>    >
>    > If you didn't compile your application with -i8, this could well be
>    because your application is treating INTEGERs as 4 bytes, but OMPI is
>    treating INTEGERs as 8 bytes.  Nothing good can come from that.
>    >
>    > If you *did* compile your application with -i8 and you're seeing this
>    kind of wonkiness, we should dig deeper and see what's going on.
>    >
>    > > My review of mpif.h and mpi.h seems to indicate that the functions
>    are defined as C int types and therefore, I assume, the coercion
>    during the compile makes the library support 64-bit indexing, i.e. int
>    -> long int
>    >
>    > FWIW: We actually define a type MPI_Fint; its actual type is
>    determined by configure (int or long int, IIRC).  When your Fortran
>    code calls C, we use the MPI_Fint type for parameters, and so it will
>    be either a 4 or 8 byte integer type.
>    >
>    > --
>    > Jeff Squyres
>    > [7]jsquy...@cisco.com
>    > For corporate legal information go to:
>    [8]http://www.cisco.com/web/about/doing_business/legal/cri/
>    >
>    > _______________________________________________
>    > users mailing list
>    > [9]us...@open-mpi.org
>    > [10]http://www.open-mpi.org/mailman/listinfo.cgi/users
>    >
> 
> 
>    > <mpi-test-64bit.tar.bz2>
>    > _______________________________________________
>    > users mailing list
>    > [11]us...@open-mpi.org
>    > [12]http://www.open-mpi.org/mailman/listinfo.cgi/users
>    --
>    Jeff Squyres
>    [13]jsquy...@cisco.com
>    For corporate legal information go to:
>    [14]http://www.cisco.com/web/about/doing_business/legal/cri/
>    _______________________________________________
>    users mailing list
>    [15]us...@open-mpi.org
>    [16]http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> References
> 
>    1. mailto:jsquy...@cisco.com
>    2. http://www.open-mpi.org/community/help/
>    3. mailto:jimparker96...@gmail.com
>    4. mailto:jsquy...@cisco.com
>    5. mailto:jimparker96...@gmail.com
>    6. http://wiki.chem.vu.nl/dirac/index.php/How_to_build_MPI_libraries_for_64-bit_integers
>    7. mailto:jsquy...@cisco.com
>    8. http://www.cisco.com/web/about/doing_business/legal/cri/
>    9. mailto:us...@open-mpi.org
>   10. http://www.open-mpi.org/mailman/listinfo.cgi/users
>   11. mailto:us...@open-mpi.org
>   12. http://www.open-mpi.org/mailman/listinfo.cgi/users
>   13. mailto:jsquy...@cisco.com
>   14. http://www.cisco.com/web/about/doing_business/legal/cri/
>   15. mailto:us...@open-mpi.org
>   16. http://www.open-mpi.org/mailman/listinfo.cgi/users


> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
