[OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently
I have successfully been using OpenMPI 1.8.3 compiled with Intel-14, using

./configure --prefix=/usr/local/mpi/$(basename $PWD) --with-threads=posix --enable-mpi-thread-multiple --disable-vt --with-scif=no

I have now switched to Intel 15.0.1. Configuring with the same options, I get minor changes in config.log about warning spotting, but it makes all the binaries, and I can compile my own Fortran code with mpif90/mpicc.

But the command 'mpiexec --verbose -n 12 ./fortran_binary' does nothing.

I checked the FAQ and started using

./configure --prefix=/usr/local/mpi/$(basename $PWD) --with-threads=posix --enable-mpi-thread-multiple --disable-vt --with-scif=no CC=icc CXX=icpc F77=ifort FC=ifort

but that makes no difference.

Only with -d do I get any more information:

mpirun -d --verbose -n 12 /home/jbray/5.0/mic2/one/intel-15_openmpi-1.8.3/one_f_debug.exe
[mic2:21851] procdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0/0
[mic2:21851] jobdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0
[mic2:21851] top: openmpi-sessions-jbray@mic2_0
[mic2:21851] tmp: /tmp
[mic2:21851] sess_dir_cleanup: job session dir does not exist
[mic2:21851] procdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0/0
[mic2:21851] jobdir: /tmp/openmpi-sessions-jbray@mic2_0/27642/0
[mic2:21851] top: openmpi-sessions-jbray@mic2_0
[mic2:21851] tmp: /tmp
[mic2:21851] sess_dir_finalize: proc session dir does not exist
<12 times>
[mic2:21851] sess_dir_cleanup: job session dir does not exist
exiting with status 139

My C codes do not have this problem.

Compiler options are

mpicxx -g -O0 -fno-inline-functions -openmp -o one_c_debug.exe async.c collective.c compute.c memory.c one.c openmp.c p2p.c variables.c auditmpi.c control.c inout.c perfio.c ring.c wave.c io.c leak.c mpiio.c pthreads.c -openmp -lpthread

mpif90 -g -O0 -fno-inline-functions -openmp -o one_f_debug.exe control.o io.f90 leak.f90 memory.f90 one.f90 ring.f90 slow.f90 swapbounds.f90 variables.f90 wave.f90 *.F90 -openmp

Any suggestions as to what is upsetting Fortran with Intel-15?

John
Re: [OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently
More investigation suggests it's the use of -fopenmp (and also its new name, -qopenmp) just to compile in OpenMP code, even if it is never executed.

mpiexec -n 12 ./one_f_debug.exe fails silently
mpiexec -n 2 ./one_f_debug.exe has a segfault

Both the segfault and the reason why changing the process count suppresses it are still a mystery.

John
Re: [OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently
A delightful bug this: you get a segfault if your code contains a random_number call and is compiled with -fopenmp, EVEN IF YOU NEVER CALL IT!

program fred
  use mpi
  integer :: ierr
  call mpi_init(ierr)
  print *,"hello"
  call mpi_finalize(ierr)
contains
  subroutine sub
    real :: a(10)
    call random_number(a)
  end subroutine sub
end program fred

The segfault has nothing to do with OpenMPI, but there remains a mystery as to why I only get the segfault error messages on lower process counts.

mpif90 -O0 -fopenmp ./fred.f90

mpiexec -n 6 ./a.out
--
mpiexec noticed that process rank 4 with PID 28402 on node mic2 exited on signal 11 (Segmentation fault).
--
jbray@mic2:intel-15_openmpi-1.8.3% mpiexec -n 12 ./a.out

It was the silence that made me raise the issue here. I am running on a 12-physical-core hyperthreaded Xeon Phi. Is there something in OpenMPI that is suppressing the messages, as I am getting 4 or 5 core files each time?

John

On 18 November 2014 04:24, Ralph Castain wrote:
> Just checked the head of the 1.8 branch (soon to be released as 1.8.4),
> and confirmed the same results. I know the thread-multiple option is still
> broken there, but will test that once we get the final fix committed.
>
> On Mon, Nov 17, 2014 at 7:29 PM, Ralph Castain wrote:
>
>> FWIW: I don't have access to a Linux box right now, but I built the OMPI
>> devel master on my Mac using Intel 2015 compilers and was able to build/run
>> all of the Fortran examples in our "examples" directory.
>>
>> I suspect the problem here is your use of the
>> --enable-mpi-thread-multiple option. The 1.8 series had an issue with
>> that option - we are in the process of fixing it (I'm waiting for an
>> updated patch), and you might be hitting it.
>>
>> If you remove that configure option, do things then work?
>> Ralph
>>
>> On Mon, Nov 17, 2014 at 5:56 PM, Gilles Gouaillardet <
>> gilles.gouaillar...@iferc.org> wrote:
>>
>>> Hi John,
>>>
>>> do you MPI_Init() or do you MPI_Init_thread(MPI_THREAD_MULTIPLE)?
>>>
>>> does your program call MPI anywhere from an OpenMP region?
>>> does your program call MPI only within an !$OMP MASTER section?
>>> does your program not invoke MPI at all from any OpenMP region?
>>>
>>> can you reproduce this issue with a simple Fortran program? or can you
>>> publish all your files?
>>>
>>> Cheers,
>>>
>>> Gilles
Re: [OMPI users] Fortran and OpenMPI 1.8.3 compiled with Intel-15 does nothing silently
The original problem used a separate file and not a module. It's clearly a bizarre Intel bug; I am only continuing to pursue it here as I'm curious as to why the segfault messages disappear at higher process counts.

John

On 18 November 2014 09:58, Michael Rachner wrote:
> It may possibly be a bug in Intel-15.0.
>
> I suspect it has to do with the contains-block and with the fact that
> you call an intrinsic subroutine in that contains-block.
>
> Normally this should work. You may try to separate the influence of both:
>
> What happens with these 3 variants of your code:
>
> variant a): using your own subroutine instead of the intrinsic one
>
> program fred
>   use mpi
>   integer :: ierr
>   call mpi_init(ierr)
>   print *,"hello"
>   call mpi_finalize(ierr)
> contains
>   subroutine sub
>     real :: a(10)
>     call mydummy_random_number(a)
>   end subroutine sub
>
>   subroutine mydummy_random_number(a)
>     real :: a(10)
>     print *,'---I am in sbr mydummy_random_number'
>   end subroutine mydummy_random_number
> end program fred
>
> variant b): removing the contains-block
>
> program fred
>   use mpi
>   integer :: ierr
>   call mpi_init(ierr)
>   print *,"hello"
>   call mpi_finalize(ierr)
> end program fred
>
> !
>
> subroutine sub
>   real :: a(10)
>   call random_number(a)
> end subroutine sub
>
> variant c): moving the contains-block into a module
>
> module MYMODULE
> contains
>   subroutine sub
>     real :: a(10)
>     call random_number(a)
>   end subroutine sub
> end module MYMODULE
>
> !
>
> program fred
>   use MYMODULE
>   use mpi
>   integer :: ierr
>   call mpi_init(ierr)
>   print *,"hello"
>   call mpi_finalize(ierr)
> end program fred
>
> Greetings
>
> Michael Rachner
[OMPI users] Converting --cpus-per-proc to --map-by for a hybrid code
To run a hybrid MPI/OpenMP code on a hyperthreaded machine with 24 virtual cores, I've been using -n 12 --cpus-per-proc 2 so I can use OMP_NUM_THREADS=2.

I now see that --cpus-per-proc is deprecated in favour of --map-by, but I've been struggling to find a conversion as the --map-by documentation is not very clear. What should I use to bind 2 virtual cores to each process?

After I use -n 12 --cpus-per-proc 2 I get

A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        mic1
   #processes:  2
   #cpus:       1

and it suggests I need an override option. But this doesn't seem to match my request for 2 cores per process; it is almost the reverse, having 2 processes per core. I don't think I'm overloading my virtual cores anyway.

John
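(A possible conversion, untested here and based on the node:PE=n form that Ralph suggests later in this thread, with a placeholder executable name, would be:

OMP_NUM_THREADS=2 mpiexec -n 12 --use-hwthread-cpus --map-by node:PE=2 ./hybrid.exe

i.e. map 12 processes onto the node and give each a processing element of 2 cpus, where --use-hwthread-cpus makes hardware threads rather than physical cores count as cpus.)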
Re: [OMPI users] Converting --cpus-per-proc to --map-by for a hybrid code
Hi Ralph

I have a motherboard with 2 X6580 chips, each with 6 cores and 2-way hyperthreading, so /proc/cpuinfo reports 24 cores.

Doing a pure compute OpenMP loop, where I'd expect the number of iterations completed in 10 s to rise with the number of threads:

with gnu and mpich
OMP_NUM_THREADS=1  -n 1 : 112 iterations
OMP_NUM_THREADS=2  -n 1 : 224 iterations
OMP_NUM_THREADS=6  -n 1 : 644 iterations
OMP_NUM_THREADS=12 -n 1 : 1287 iterations
OMP_NUM_THREADS=22 -n 1 : 1182 iterations
OMP_NUM_THREADS=24 -n 1 : 454 iterations

which shows that mpich is spreading across the cores, but hyperthreading is not useful, and using the whole node is counterproductive.

with gnu and openmpi 1.8.3
OMP_NUM_THREADS=1 mpiexec -n 1 : 112
OMP_NUM_THREADS=2 mpiexec -n 1 : 113

which suggests you aren't allowing the threads to spread across cores.

Adding --cpus-per-proc I gain access to the resources on one chip:

OMP_NUM_THREADS=1 mpiexec --cpus-per-proc 1 -n 1 : 112
OMP_NUM_THREADS=2 mpiexec --cpus-per-proc 2 -n 1 : 224
OMP_NUM_THREADS=6 mpiexec --cpus-per-proc 2 -n 1 : 644

then

OMP_NUM_THREADS=12 mpiexec --cpus-per-proc 12 -n 1

A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that has less cpus than
requested ones:

  #cpus-per-proc:  12
  number of cpus:  6
  map-by:          BYNUMA

So you aren't happy using both chips for one process.

OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 1 --use-hwthread-cpus : 112
OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
OMP_NUM_THREADS=4 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 6 --use-hwthread-cpus : 324
OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 631
OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 647

OMP_NUM_THREADS=24 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus

A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that has less cpus than
requested ones:

  #cpus-per-proc:  24
  number of cpus:  12
  map-by:          BYNUMA

OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 644
OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 24 --use-hwthread-cpus : 644

A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that has less cpus than
requested ones:

  #cpus-per-proc:  24
  number of cpus:  12
  map-by:          BYNUMA

So it seems that --use-hwthread-cpus means that --cpus-per-proc changes from physical cores to hyperthreaded cores, but I can't get both chips working on the problem in the way mpich can.

John
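(The benchmark source is not included in the thread; the following is only a hypothetical Fortran sketch of the kind of loop being timed, with made-up names. It counts the iterations of arbitrary floating-point work completed by all OpenMP threads in a 10-second window, so the total should rise with the number of cores the threads are actually allowed to run on:

! Hypothetical sketch only: the real benchmark code is not part of this thread.
! Each OpenMP thread spins on arbitrary work for 10 seconds; the summed
! iteration count indicates how many cores the threads really occupied.
program compute_loop
  use mpi
  use omp_lib
  implicit none
  integer :: ierr, rank
  integer(kind=8) :: iters
  real(kind=8) :: t0, x

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)

  iters = 0
  x     = 0.0d0
  t0    = omp_get_wtime()

!$omp parallel reduction(+:iters) reduction(+:x)
  do while (omp_get_wtime() - t0 < 10.0d0)
     x     = x + sin(real(iters, kind=8))   ! arbitrary work to keep the core busy
     iters = iters + 1
  end do
!$omp end parallel

  print *, 'rank', rank, ': threads =', omp_get_max_threads(), &
           ', iterations =', iters, x
  call mpi_finalize(ierr)
end program compute_loop

Built with mpif90 -O0 -fopenmp and launched with the mpiexec variants above, the printed iteration count would be the figure being compared.)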
Re: [OMPI users] Converting --cpus-per-proc to --map-by for a hybrid code
lstopo is pretty!

John
Re: [OMPI users] Converting --cpus-per-proc to --map-by for a hybrid code
OMP_NUM_THREADS=1 mpiexec -n 1 gnu_openmpi_a/one_c_prof.exe : 113 iterations
OMP_NUM_THREADS=6 mpiexec -n 1 --map-by node:PE=6 : 639 iterations
OMP_NUM_THREADS=6 mpiexec -n 2 --map-by node:PE=6 : 639 iterations
OMP_NUM_THREADS=12 mpiexec -n 1 --map-by node:PE=12 : 1000 iterations
OMP_NUM_THREADS=12 mpiexec -n 2 --use-hwthread-cpus --map-by node:PE=12 : 646 iterations

That's looking better, with limited gain for one process on 2 chips. Thanks.

I am testing Allinea's profiler, and our goal is to point out bad practice, so I need to run all sorts of pathological cases. Now to see what our software thinks.

Thanks for your help

John

On 8 December 2014 at 15:57, Ralph Castain wrote:
> Thanks for sending that lstopo output - helped clarify things for me. I
> think I now understand the issue. Mostly a problem of my being rather dense
> when reading your earlier note.
>
> Try adding --map-by node:PE=N to your cmd line. I think the problem is that
> we default to --map-by numa if you just give cpus-per-proc and no mapping
> directive, as we know that having threads that span multiple numa regions is
> bad for performance.
[OMPI users] Sample code using the more obscure MPI_Neighbor routines
I want to validate Allinea's profiler on the complete set of point-to-point and collective routines available in MPI, and went looking for examples of the MPI_Neighbor routines in action. I can find a range of PDF tutorials and papers, but no actual sample codes to try.

The OpenMPI examples for 1.8.4 don't use the MPI_Neighbor routines. Is there any plan to add them, or can people recommend another program I can compile up?

John
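(For anyone after a starting point, below is a minimal illustrative sketch, not an official Open MPI example, of MPI_Neighbor_allgather on a periodic 1-D Cartesian communicator. It assumes an MPI-3 library such as OpenMPI 1.8.x and uses the same mpi module style as the other codes in this thread:

! Minimal sketch (not an official Open MPI example): each rank exchanges its
! rank number with its two neighbours on a periodic 1-D Cartesian topology
! using MPI_Neighbor_allgather (MPI-3).
program neighbor_demo
  use mpi
  implicit none
  integer :: ierr, rank, nprocs, cart_comm
  integer :: dims(1), sendbuf(1), recvbuf(2)
  logical :: periods(1)

  call mpi_init(ierr)
  call mpi_comm_rank(mpi_comm_world, rank, ierr)
  call mpi_comm_size(mpi_comm_world, nprocs, ierr)

  dims(1)    = nprocs          ! one-dimensional ring over all ranks
  periods(1) = .true.          ! periodic, so every rank has 2 neighbours
  call mpi_cart_create(mpi_comm_world, 1, dims, periods, .false., cart_comm, ierr)

  sendbuf(1) = rank
  ! For a 1-D Cartesian communicator the neighbourhood is (left, right),
  ! so each rank receives exactly 2 integers.
  call mpi_neighbor_allgather(sendbuf, 1, mpi_integer, &
                              recvbuf, 1, mpi_integer, cart_comm, ierr)

  print *, 'rank', rank, 'received from neighbours:', recvbuf

  call mpi_comm_free(cart_comm, ierr)
  call mpi_finalize(ierr)
end program neighbor_demo

The other neighbourhood collectives, MPI_Neighbor_alltoall and the v/w variants, follow the same pattern, with the neighbourhood defined either by a Cartesian topology as here or by MPI_Dist_graph_create_adjacent.)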