Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-22 Thread Christof Koehler
Hello again,

I tried to replicate the situation on the workstation at my desk,
running Ubuntu 14.04 (gcc 4.8.4) with the OS-supplied LAPACK and
BLAS libraries.

With Open MPI 2.0.1 (mpirun -np 4 xdsyevr) I get "136 tests completed and
failed." with the "IL, IU, VL or VU altered by PDSYEVR" message, but
reasonable-looking numbers as described before.

With 1.10 I get "136 tests completed and passed residual checks."
instead, as observed before.

So this is likely not an Omni-Path problem but something else in 2.0.1.

I should perhaps clarify that I am using the current revision 206 from
the ScaLapack trunk
(svn co https://icl.cs.utk.edu/svn/scalapack-dev/scalapack/trunk),
but if I remember correctly I had very similar problems with the
ScaLapack 2.0.2 release tarball.

Both MPIs were built with 
./configure --with-hwloc=internal --enable-static 
--enable-orterun-prefix-by-default


Best Regards

Christof

On Fri, Nov 18, 2016 at 11:25:06AM -0700, Howard Pritchard wrote:
> Hi Christof,
> 
> Thanks for trying out 2.0.1.  Sorry that you're hitting problems.
> Could you try to run the tests using the 'ob1' PML in order to
> bypass PSM2?
> 
> mpirun --mca pml ob1 (all the rest of the args)
> 
> and see if you still observe the failures?
> 
> Howard
> 
> 
> 2016-11-18 9:32 GMT-07:00 Christof Köhler <
> christof.koeh...@bccms.uni-bremen.de>:
> 
> > Hello everybody,
> >
> > I am observing failures in the xdsyevr (and xssyevr) ScaLapack self tests
> > when running on one or two nodes with OpenMPI 2.0.1. With 1.10.4 no
> > failures are observed. Also, with mvapich2 2.2 no failures are observed.
> > The other testers appear to be working with all MPIs mentioned (have to
> > triple check again). I somehow overlooked the failures below at first.
> >
> > The system is an Intel Omni-Path system (newest Intel driver release 10.2),
> > i.e. we are using the PSM2 MTL, I believe.
> >
> > I built the OpenMPIs with gcc 6.2 and the following identical options:
> > ./configure  FFLAGS="-O1" CFLAGS="-O1" FCFLAGS="-O1" CXXFLAGS="-O1"
> > --with-psm2 --with-tm --with-hwloc=internal --enable-static
> > --enable-orterun-prefix-by-default
> >
> > The ScaLapack build is also with gcc 6.2 and openblas 0.2.19, using "-O1
> > -g" as FCFLAGS and CCFLAGS, identical for all tests; only the wrapper
> > compiler changes.
> >
> > With OpenMPI 1.10.4 I see on a single node
> >
> >  mpirun -n 4 -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -mca
> > oob_tcp_if_include eth0,team0 -host node009,node009,node009,node009
> > ./xdsyevr
> > 136 tests completed and passed residual checks.
> > 0 tests completed without checking.
> > 0 tests skipped for lack of memory.
> > 0 tests completed and failed.
> >
> > With OpenMPI 1.10.4 I see on two nodes
> >
> > mpirun -n 4 -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -mca
> > oob_tcp_if_include eth0,team0 -host node009,node010,node009,node010
> > ./xdsyevr
> >   136 tests completed and passed residual checks.
> > 0 tests completed without checking.
> > 0 tests skipped for lack of memory.
> > 0 tests completed and failed.
> >
> > With OpenMPI 2.0.1 I see on a single node
> >
> > mpirun -n 4 -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -mca
> > oob_tcp_if_include eth0,team0 -host node009,node009,node009,node009
> > ./xdsyevr
> > 32 tests completed and passed residual checks.
> > 0 tests completed without checking.
> > 0 tests skipped for lack of memory.
> >   104 tests completed and failed.
> >
> > With OpenMPI 2.0.1 I see on two nodes
> >
> > mpirun -n 4 -x PATH -x LD_LIBRARY_PATH -x OMP_NUM_THREADS -mca
> > oob_tcp_if_include eth0,team0 -host node009,node010,node009,node010
> > ./xdsyevr
> >32 tests completed and passed residual checks.
> > 0 tests completed without checking.
> > 0 tests skipped for lack of memory.
> >   104 tests completed and failed.
> >
> > A typical failure looks like this in the output
> >
> > IL, IU, VL or VU altered by PDSYEVR
> >500   1   1   1   8   Y 0.26-1.00  0.19E-02   15. FAILED
> >500   1   2   1   8   Y 0.29-1.00  0.79E-03   3.9 PASSED
> >  EVR
> > IL, IU, VL or VU altered by PDSYEVR
> >500   1   1   2   8   Y 0.52-1.00  0.82E-03   2.5 FAILED
> >500   1   2   2   8   Y 0.41-1.00  0.79E-03   2.3 PASSED
> >  EVR
> >500   2   2   2   8   Y 0.18-1.00  0.78E-03   3.0 PASSED
> >  EVR
> > IL, IU, VL or VU altered by PDSYEVR
> >500   4   1   4   8   Y 0.09-1.00  0.95E-03   4.1 FAILED
> >500   4   4   1   8   Y 0.11-1.00  0.91E-03   2.8 PASSED
> >  EVR
> >
> >
> > The variable OMP_NUM_THREADS is set to 1 to stop openblas from threading.
> > We see similar problems with the Intel 2016 compilers, but I believe gcc is
> > a good baseline.
> >
> > Any ideas? For us this is a real problem, in that we do not know if this
> > indicates a network (transport) issue in the Intel software stack (libpsm2,
> > hfi1 kernel module) which might affect our production

Re: [OMPI users] valgrind invalid read

2016-11-22 Thread Yann Jobic

Hi,

I manually changed the file. Moreover, I also tried the 1.8.4 Open MPI
version.


I still have this invalid read.

Am I doing something wrong?

Thanks,

Yann


On 22/11/2016 at 00:50, Gilles Gouaillardet wrote:

Yann,


this is a bug that was previously reported, and the fix is pending
review.


meanwhile, you can manually apply the patch available at 
https://github.com/open-mpi/ompi/pull/2418



Cheers,


Gilles


On 11/18/2016 9:34 PM, Yann Jobic wrote:

Hi,

I'm using valgrind 3.12 with openmpi 2.0.1.
The code simply sends an integer to another process:
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main (int argc, char **argv) {
  const int tag = 13;
  int size, rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  if (size < 2) {
  fprintf(stderr,"Requires at least two processes.\n");
  exit(-1);
  }

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
int i=3;
const int dest = 1;

MPI_Send(&i,   1, MPI_INT, dest, tag, MPI_COMM_WORLD);

printf("Rank %d: sent int\n", rank);
  }
  if (rank == 1) {
int j;
const int src=0;
MPI_Status status;

MPI_Recv(&j,   1, MPI_INT, src, tag, MPI_COMM_WORLD, &status);
printf("Rank %d: Received: int = %d\n", rank,j);
  }

  MPI_Finalize();

  return 0;
}
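
For reference, the reproducer is built with the Open MPI wrapper compiler,
something like (the exact flags are my assumption, chosen for a valgrind run):

mpicc -g basic.c -o exe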


I'm getting the error:
valgrind MPI wrappers 46313: Active for pid 46313
valgrind MPI wrappers 46313: Try MPIWRAP_DEBUG=help for possible options
valgrind MPI wrappers 46314: Active for pid 46314
valgrind MPI wrappers 46314: Try MPIWRAP_DEBUG=help for possible options
Rank 0: sent int
==46314== Invalid read of size 4
==46314==at 0x400A3D: main (basic.c:33)
==46314==  Address 0xffefff594 is on thread 1's stack
==46314==  in frame #0, created by main (basic.c:5)
==46314==
Rank 1: Received: int = 3

The invalid read is at the printf line.

Do you have any clue why I am getting it?

I ran the code with:
LD_PRELOAD=$prefix/lib/valgrind/libmpiwrap-amd64-linux.so mpirun -np 2 $prefix/bin/valgrind ./exe


Thanks in advance,

Yann





Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-22 Thread Gilles Gouaillardet
Christof,

out of curiosity, could you try
mpirun --mca coll ^tuned ...
and see if it helps?

Cheers,

Gilles



Re: [OMPI users] valgrind invalid read

2016-11-22 Thread Gilles Gouaillardet
Yann,

my bad, the patch you need is at
https://github.com/open-mpi/ompi/pull/2368.patch
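
In case it helps, GitHub serves pull requests in git format-patch form, so
against a git checkout of the ompi sources one way to apply it is roughly
(the checkout directory name here is an assumption):

cd ompi
curl -LO https://github.com/open-mpi/ompi/pull/2368.patch
git am 2368.patch    # or: patch -p1 < 2368.patch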

sorry for the confusion,

Gilles



Re: [OMPI users] ScaLapack tester fails with 2.0.1, works with 1.10.4; Intel Omni-Path

2016-11-22 Thread Christof Koehler
Hello,

On Tue, Nov 22, 2016 at 10:35:57PM +0900, Gilles Gouaillardet wrote:
> Christof,
>
> out of curiosity, could you try
> mpirun --mca coll ^tuned ...
> and see if it helps?

No, at least not for the workstation example. I will test with my laptop
(Debian stable) tomorrow.

Thank you all for your help! This is really strange.

Cheers


Christof


[OMPI users] Follow-up to Open MPI SC'16 BOF

2016-11-22 Thread Howard Pritchard
Hello Folks,

This is a follow-up to the question posed at the SC’16 Open MPI BOF: would
the community prefer a limited-feature but backwards-compatible v2.2.x
release sometime in 2017, or a v3.x release (not backwards compatible but
potentially with more features) sometime in late 2017 to early 2018?

BOF attendees expressed an interest in having a list of features that might
make it into v2.2.x, and ones that the Open MPI developers think would be
too hard to backport from the development branch (master) to a v2.2.x
release stream.

Here are the requested lists:

Features that we anticipate we could port to a v2.2.x release

   1. Improved collective performance (a new “tuned” module)
   2. Enable Linux CMA shared memory support by default
   3. PMIx 3.0 (If new functionality were to be used in this release of
   Open MPI)

Features that we anticipate would be too difficult to port to a v2.2.x
release

   1. Revamped CUDA support
   2. MPI_ALLOC_MEM integration with memkind
   3. OpenMP affinity/placement integration
   4. THREAD_MULTIPLE improvements to MTLs (not so clear on the level of
   difficulty for this one)

You can register your opinion on whether to go with a v2.2.x release next
year or to go from v2.1.x to v3.x in late 2017 or early 2018 at the link
below:

https://www.open-mpi.org/sc16/

Thanks very much,

Howard

-- 

Howard Pritchard

HPC-DES

Los Alamos National Laboratory

Re: [OMPI users] Follow-up to Open MPI SC'16 BOF

2016-11-22 Thread Jeff Hammond
>    1. MPI_ALLOC_MEM integration with memkind

It would make sense to prototype this as a standalone project that is
integrated with any MPI library via PMPI.  It's probably a day or two of
work to get that going.
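
To make the idea concrete, here is a minimal sketch of such a shim. It
assumes the memkind API (MEMKIND_HBW with a MEMKIND_DEFAULT fallback) and
simply ignores the MPI_Info argument; the file name and build line are
illustrative only, not part of any existing project:

/* pmpi_memkind.c -- illustrative sketch, not a complete implementation.
 * Build as a shared library and LD_PRELOAD it in front of the MPI library,
 * e.g.:  mpicc -shared -fPIC pmpi_memkind.c -lmemkind -o libpmpi_memkind.so */
#include <mpi.h>
#include <memkind.h>

int MPI_Alloc_mem(MPI_Aint size, MPI_Info info, void *baseptr)
{
    /* Try high-bandwidth memory first; a real prototype would inspect the
     * info object to select the kind instead of ignoring it. */
    void *p = memkind_malloc(MEMKIND_HBW, (size_t) size);
    if (p == NULL) {
        p = memkind_malloc(MEMKIND_DEFAULT, (size_t) size);
    }
    if (p == NULL) {
        return MPI_ERR_NO_MEM;
    }
    *(void **) baseptr = p;
    return MPI_SUCCESS;
}

int MPI_Free_mem(void *base)
{
    /* memkind_free() with a NULL kind detects the kind automatically,
     * so both allocation paths above are covered. */
    memkind_free(NULL, base);
    return MPI_SUCCESS;
}

Nothing in it depends on Open MPI internals; the MPI profiling interface is
what keeps it implementation-agnostic.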

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/

Re: [OMPI users] Follow-up to Open MPI SC'16 BOF

2016-11-22 Thread Howard Pritchard
Hi Jeff,

I don't think it was the use of memkind itself that was the issue, but rather
a need to refactor the way Open MPI uses info objects.  I don't recall the
details.

Howard

