[OMPI users] Cross-compiling

2017-11-14 Thread Alberto Ortiz
Hi,
I am trying to work in the following type of environment:

1- A Linux PC on which I intend to cross-compile MPI programs for ARM embedded
processors
2- The ARM boards themselves

I have Open MPI built on the ARM boards with dynamic libraries, both so I can
compile natively there and so I can use 'mpirun' on the cross-compiled
programs I copy over from the PC.

For various reasons I need to cross-compile the MPI programs on the PC, so I
believe I need to build Open MPI on the PC as well, selecting the
cross-compiler and the host on which the programs will run. Even so, I don't
seem to get it right.

I have tried the following configure options:
../openmpi-3.0.0/configure  --enable-static --disable-shared
--host=arm-linux --disable-mpi-fortran --prefix=/home/user/openmpi-install/
CC=arm-linux-gnueabihf-gcc CXX=arm-linux-gnueabihf-g++

My intention is to run 'mpicc' on the PC so that it invokes the cross-compiler
specified with 'CC', together with all the MPI flags, producing a statically
linked program that will run on the ARM embedded processors.
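
To be concrete, this is roughly what I am aiming for (an untested sketch; the
--build/--host triplets, paths and the test program name are just placeholders
for my setup):

# Configure Open MPI on the PC for cross-compilation (adjust triplet and paths)
../openmpi-3.0.0/configure \
  --build=x86_64-pc-linux-gnu \
  --host=arm-linux-gnueabihf \
  --enable-static --disable-shared \
  --disable-mpi-fortran \
  --prefix=/home/user/openmpi-install \
  CC=arm-linux-gnueabihf-gcc CXX=arm-linux-gnueabihf-g++

# Cross-compile and statically link a test program with the resulting wrapper,
# then copy the binary to the boards and launch it with the natively built mpirun
/home/user/openmpi-install/bin/mpicc -static -o hello hello.c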

Thank you in advance,
Alberto
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] Build options

2017-11-14 Thread Bennet Fauber
We are trying SLURM for the first time, and prior to this I've always built
OMPI with Torque support.  I was hoping that someone with more experience
than I with both OMPI and SLURM might provide a bit of up-front advice.

My situation is that we are running CentOS 7.3 (soon to be 7.4). We use
Mellanox cards of several generations, but my systems team tells me the
driver version is the same everywhere.

82:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0
5GT/s - IB QDR / 10GigE] (rev b0)

16:00.0 Network controller: Mellanox Technologies MT27500 Family
[ConnectX-3]

We have mixed NFSv3 shared directories and a Lustre filesystem (DDN).

In the past, we had issues with using `dlopen`, and we've had much grief
getting OMPI to place jobs on processors correctly, we think because we used
cpusets at one point, use cgroups now, and jobs share nodes with other
jobs.  My previous build options were

export CONFIGURE_FLAGS='--disable-dlopen --enable-shared'
export COMPILERS='CC=gcc CXX=g++ FC=gfortran F77=gfortran'
export COMP_NAME='gcc-4.8.5'
export PREFIX=/shared/nfs/directory

./configure \
--prefix=${PREFIX} \
--mandir=${PREFIX}/share/man \
--with-tm \
--with-verbs \
$CONFIGURE_FLAGS \
$COMPILERS

Additionally, we have typically included the following lines in our
$PREFIX/etc/openmpi-mca-params.conf

orte_hetero_nodes=1
hwloc_base_binding_policy=none

Those may be there for purely historical reasons.  So far as I know, there
is no deterministic test recorded anywhere that would detect whether those
are still needed.
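
One thing I may do is run a quick, non-rigorous check of where ranks actually
land with and without those settings, along these lines (just a sketch):

# compare reported bindings with the defaults and with explicit core binding
mpirun -np 4 --report-bindings hostname
mpirun -np 4 --report-bindings --bind-to core hostname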

For this new resource manager, I am thinking that the compiler flags stay
the same, but that the configure line be changed to

./configure \
--prefix=${PREFIX} \
--mandir=${PREFIX}/share/man \
--with-slurm \
--with-pmi=/usr/include/slurm \
--with-verbs \
$CONFIGURE_FLAGS \
$COMPILERS
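
Once a build finishes, I plan to sanity-check that the SLURM and PMI support
actually got compiled in, with something like (a sketch, using the new install):

# list the components/settings that mention slurm or pmi
${PREFIX}/bin/ompi_info | grep -i slurm
${PREFIX}/bin/ompi_info | grep -i pmi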

I am curious: what file system support does --lustre-support enable?

I will be installing three versions of OMPI to start:  1.10.7, 2.1.2, and
3.0.0.  Are there changes to the configure line that are a priori known to
be needed?

There are references on the FAQ and other installation notes that lead me
to believe they are a bit out of date, so I am asking preemptively here.
Apologies if that is an incorrect assessment.

Thanks,  -- bennet
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Build options

2017-11-14 Thread David Lee Braun
Hi Bennet,

What is the issue you have with dlopen?  And what options do you use
with mpirun's --bind-to?

I think the only change I make to my Open MPI configure line is to add
'--with-cuda=...' and '--with-pmi=...'.
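
Roughly, it looks like this (a sketch; the CUDA and PMI paths are placeholders
for our site):

./configure \
  --prefix=${PREFIX} \
  --with-cuda=/usr/local/cuda \
  --with-pmi=/usr \
  CC=gcc CXX=g++ FC=gfortran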

D

-- 
David Lee Braun
Manager of Computational Facilities
for Dr Charles L. Brooks, III Ph.D.
930 N. University Ave
Chemistry 2006
(734) 615-1450
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users


Re: [OMPI users] Build options

2017-11-14 Thread Bennet Fauber
David,

Thanks for the reply.  I believe dlopen and Rmpi don't get along because
Rmpi uses fork; that's a vague recollection from several years ago.  R is
pretty important for us.  I believe that leaving dlopen enabled also hits our
NFS server harder with I/O requests for the plugin modules.

-- bennet



___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] Help with binding processes correctly in hybrid code (Open MPI + OpenMP)

2017-11-14 Thread Anil K. Dasanna
Hello all,

I am relatively new to MPI computing. I am doing particle simulations.
So far I have only used pure MPI and never had a problem, but for my system
hybrid programming is the best approach.
However, I always fail to bind all processes correctly and receive binding
errors from the cluster.
Could one of you please clarify the correct mpirun parameters for the two
cases below:

1) I would like to use, let's say, two MPI tasks with 16 OpenMP threads each.
I request nodes=2:ppn=16.
2) For the same allocation, how should I give the parameters so that I have
4 MPI tasks with 8 OpenMP threads each?

I also tried options with --map-by, and it sometimes happened that both MPI
tasks were placed on the same node while the other node sat idle. I really
appreciate your help.
My Open MPI version is 1.8.


-- 
Kind Regards,
Anil.
*
"Its impossible" - said Pride
"Its risky" - said Experience
"Its pointless" - said Reason
"Give it a try" - whispered the heart
*
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Help with binding processes correctly in hybrid code (Open MPI + OpenMP)

2017-11-14 Thread Gilles Gouaillardet
Hi,

per 
https://www2.cisl.ucar.edu/resources/computational-systems/cheyenne/running-jobs/pbs-pro-job-script-examples,
you can try

#PBS -l select=2:ncpus=16:mpiprocs=2:ompthreads=8
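
And on the mpirun side, something along these lines should give the layouts
you describe (a sketch from memory; ./your_app is a placeholder and the exact
--map-by spelling can vary a bit between Open MPI versions):

# case 1: 2 MPI tasks total (1 per node), 16 OpenMP threads each
export OMP_NUM_THREADS=16
mpirun -np 2 --map-by ppr:1:node:pe=16 --bind-to core ./your_app

# case 2: 4 MPI tasks total (2 per node), 8 OpenMP threads each
export OMP_NUM_THREADS=8
mpirun -np 4 --map-by ppr:2:node:pe=8 --bind-to core ./your_app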

Cheers,

Gilles


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users