Re: [OMPI users] divide-by-zero in mca_btl_openib_add_procs

2014-06-03 Thread Alain Miniussi
Please note that I had the problem with 13.1.0 but not with 13.1.1. On 28/05/2014 00:47, Ralph Castain wrote: On May 27, 2014, at 3:32 PM, Alain Miniussi wrote: Unfortunately, the debug library works like a charm (which makes the uninitialized variable issue more likely). Indeed - sounds

Re: [OMPI users] divide-by-zero in mca_btl_openib_add_procs

2014-06-03 Thread Ralph Castain
Yeah, I think we've concluded that this is just a bug in the compiler and not something wrong in OMPI itself. Sadly, compilers (just like all software) also have bugs. I'd just use the upgraded version as they apparently fixed the problem. On Jun 3, 2014, at 4:43 AM, Alain Miniussi wrote: >

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread Jeff Squyres (jsquyres)
I'm sorry it took so long -- I finally fixed this on the trunk and have scheduled this for the v1.8 branch. There were a small number of functions in the tkr interface that had ierr instead of ierror (some of the Dist_graph functions), which were probably added after the fixes were applied a ye

[OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-03 Thread Fischer, Greg A.
Hello openmpi-users, I'm running into a perplexing problem on a new system, whereby I'm experiencing intermittent segmentation faults when I run the ring_c.c example and use the openib BTL. See an example below. Approximately 50% of the time it provides the expected output, but the other 50% of

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread W Spector
Jeff Squyres wrote: > Did you find any other places where we accidentally had ierr instead of ierror? I will have to check the trunk and see. The only place I know of where the Standard wants IERR instead of IERROR is with the user-defined subroutines for MPI_KEYVAL_CREATE - which is depreca

Re: [OMPI users] ierr vs ierror in F90 mpi module

2014-06-03 Thread Jeff Squyres (jsquyres)
Ok. I think most were fixed after you reported them last year, but a few new MPI-3 functions were added after that, and they accidentally had "ierr" instead of "ierror". On Jun 3, 2014, at 11:47 AM, W Spector wrote: > Jeff Squyres wrote: > > Did you find any other places where we accidentall

Re: [OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-03 Thread Fischer, Greg A.
Apologies - I forgot to add some of the information requested by the FAQ:
1. OpenFabrics is provided by the Linux distribution:
[binf102:fischega] $ rpm -qa | grep ofed
ofed-kmp-default-1.5.4.1_3.0.76_0.11-0.11.5
ofed-1.5.4.1-0.11.5
ofed-doc-1.5.4.1-0.11.5
2. Linux Distro / Kernel:

Re: [OMPI users] can't preload binary to remote machine

2014-06-03 Thread Ralph Castain
Sorry for the delayed response - it's been a little hectic here. I suspect the problem is that we really need a passwordless ssh connection in order to preload the file for 1.6.5. This isn't required in the 1.8 series, so you might want to try it with 1.8.1. Otherwise, resolve the password issue and it

Re: [OMPI users] intermittent segfaults with openib on ring_c.c

2014-06-03 Thread Ralph Castain
Sounds odd - can you configure OMPI with --enable-debug and run it again? If it fails and you can get a core dump, could you tell us the line number where it is failing? On Jun 3, 2014, at 9:58 AM, Fischer, Greg A. wrote: > Apologies – I forgot to add some of the information requested by the FAQ: