Re: [OMPI users] Hang in MPI_Abort

2016-07-01 Thread Thomas Jahns
Hello,

On 06/30/2016 11:00 PM, Ralph Castain wrote:
==8518== Conditional jump or move depends on uninitialised value(s)
==8518==    at 0x401C728: index (in /usr/lib/ld-2.23.90.so)

This might be a red herring, but why is index from ld-2.23.90.so? Shouldn't that be a libc function from libc.so.

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Ralph Castain
Rats - and this only happens on arm32?

> On Jun 30, 2016, at 1:56 PM, Orion Poplawski wrote:
>
> On 06/30/2016 02:55 PM, Orion Poplawski wrote:
>> valgrind output:
>>
>> $ valgrind mpiexec -n 6 ./testphdf5
>> ==8518== Memcheck, a memory error detector
>> ==8518== Copyright (C) 2002-2015, and GN

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Orion Poplawski
On 06/30/2016 02:55 PM, Orion Poplawski wrote:
> valgrind output:
>
> $ valgrind mpiexec -n 6 ./testphdf5
> ==8518== Memcheck, a memory error detector
> ==8518== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
> ==8518== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Orion Poplawski
valgrind output:

$ valgrind mpiexec -n 6 ./testphdf5
==8518== Memcheck, a memory error detector
==8518== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==8518== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==8518== Command: mpiexec -n 6 ./testphdf5
==8518==
=

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Ralph Castain
So the application procs are all gone, but mpiexec isn’t exiting? I’d suggest running valgrind, given the corruption.

> On Jun 30, 2016, at 10:21 AM, Orion Poplawski wrote:
>
> On 06/30/2016 10:33 AM, Orion Poplawski wrote:
>> No, just mpiexec is running. single node. Only see it when the tes

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Orion Poplawski
On 06/30/2016 10:33 AM, Orion Poplawski wrote:
> No, just mpiexec is running. single node. Only see it when the test is
> executed with "make check", not seeing it if I just run mpiexec -n 6
> ./testphdf5 by hand.

Hmm, now I'm seeing it running mpiexec by hand. Trying to check it via gdb indic

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Orion Poplawski
No, just mpiexec is running. single node. Only see it when the test is executed with "make check", not seeing it if I just run mpiexec -n 6 ./testphdf5 by hand.

On 06/30/2016 09:58 AM, Ralph Castain wrote:
> Are the procs still alive? Is this on a single node?
>
>> On Jun 30, 2016, at 8:49 AM,

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Ralph Castain
Are the procs still alive? Is this on a single node?

> On Jun 30, 2016, at 8:49 AM, Orion Poplawski wrote:
>
> I'm seeing hangs when MPI_Abort is called. This is with openmpi 1.10.3. e.g:
>
> program output:
>
> Testing -- big dataset test (bigdset)
> Proc 3: *** Parallel ERROR ***
>VRF

Re: [OMPI users] Hang in MPI_Abort

2016-06-30 Thread Orion Poplawski
On 06/30/2016 09:49 AM, Orion Poplawski wrote:
> I'm seeing hangs when MPI_Abort is called. This is with openmpi 1.10.3. e.g:

I'll also note that I'm seeing this on 32-bit arm, but not i686 or x86_64.

--
Orion Poplawski
Technical Manager 303-415-9701 x222
NWRA, Boulder/CoR
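
For anyone trying to reproduce the symptom described in this thread (the launcher not exiting after the test aborts), one way to catch it in an automated run is to put a time limit on the launcher. The following is only a rough sketch, not something taken from the thread: the helper name `run_with_timeout` and the time limit are hypothetical, and it assumes a hang can be told apart from a slow run by a generous timeout.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and report whether it exited on its own or had to be killed.

    Returns ("exited", returncode) if the command finished within
    timeout_s seconds, or ("hung", None) if the timeout expired.
    """
    try:
        result = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before raising. Any
        # processes that child started are not cleaned up here.
        return ("hung", None)
    return ("exited", result.returncode)


# Hypothetical usage with the command line quoted in the thread
# (the 300-second limit is an assumption, not a measured value):
#   status, code = run_with_timeout(["mpiexec", "-n", "6", "./testphdf5"], 300)
#   print(status, code)
```

A result of `("hung", None)` only says the launcher was still running at the time limit; it does not say why, so gdb or valgrind would still be needed to look at the cause.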