Re: [OMPI users] Nondeterministic SIGSEGV in MPI_Send to dynamically created processes

2016-02-19 Thread George Bosilca
Artur, your email does not contain enough information to pinpoint the problem. However, there are several hints that tend to indicate a problem in your application. 1. in the collective communication that succeeds, the MPI_Intercomm_merge, the processes are doing [at least] one MPI_Allreduce foll

Re: [OMPI users] Nondeterministic SIGSEGV in MPI_Send to dynamically created processes

2016-02-19 Thread Gilles Gouaillardet
Artur, do you check all the error codes returned by MPI_Comm_spawn_multiple? (That way you can confirm the requested number of tasks was spawned.) Since the error occurs only on the first MPI_Send, you might want to retrieve the rank and size and print them right before MPI_Send, just to make sure the comm

Re: [OMPI users] Nondeterministic SIGSEGV in MPI_Send to dynamically created processes

2016-02-19 Thread Gilles Gouaillardet
Artur, in Open MPI, MPI_Comm is an opaque pointer, so strictly speaking, a high value might not be an issue. Can you have your failed processes generate a core file and post the stack trace? By the way, do you call MPI_Send on the intra-communicator created by MPI_Intercomm_merge? What is the minimal config neede

[OMPI users] Nondeterministic SIGSEGV in MPI_Send to dynamically created processes

2016-02-19 Thread Artur Malinowski
Hi, I have a problem with my application, which is based on dynamic process management. The scenario related to process creation is as follows: 1. All processes call MPI_Comm_spawn_multiple to spawn one additional process per node. 2. Parent processes call MPI_Intercomm_merge. 3. C

Re: [OMPI users] Error building openmpi-dev-3498-gdc4d3ed on Solaris

2016-02-19 Thread Ralph Castain
Just pushed a change that renamed the field; hopefully it is fixed now. Thanks! > On Feb 19, 2016, at 9:54 AM, Dave Love wrote: > > Gilles Gouaillardet writes: > >> a field from orte_iof_proc_t is named "stdin" >> could stdin be #defined under the hood in Solaris ? > > It's defined as "(&__iob[0])

Re: [OMPI users] Error building openmpi-dev-3498-gdc4d3ed on Solaris

2016-02-19 Thread Dave Love
Gilles Gouaillardet writes: > a field from orte_iof_proc_t is named "stdin" > could stdin be #defined under the hood in Solaris ? It's defined as "(&__iob[0])" on Solaris 10; it's just #defined differently by glibc. See stdio.h(7posix).

Re: [OMPI users] Error building openmpi-dev-3498-gdc4d3ed on Solaris

2016-02-19 Thread Gilles Gouaillardet
A field from orte_iof_proc_t is named "stdin". Could stdin be #defined under the hood in Solaris? If so, then renaming this field should do the trick. I will double-check that on Monday. Cheers, Gilles On Saturday, February 20, 2016, Ralph Castain wrote: > I’m afraid I have no idea what Solari

Re: [OMPI users] Error building openmpi-dev-3498-gdc4d3ed on Solaris

2016-02-19 Thread Ralph Castain
I’m afraid I have no idea what Solaris is complaining about here. > On Feb 19, 2016, at 6:52 AM, Siegmar Gross > wrote: > > Hi, > > yesterday I tried to build openmpi-dev-3498-gdc4d3ed on my > machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux > 12.1 x86_64) with gcc-5.1.0 and S

[OMPI users] Error building openmpi-dev-3498-gdc4d3ed on Solaris

2016-02-19 Thread Siegmar Gross
Hi, yesterday I tried to build openmpi-dev-3498-gdc4d3ed on my machines (Solaris 10 Sparc, Solaris 10 x86_64, and openSUSE Linux 12.1 x86_64) with gcc-5.1.0 and Sun C 5.13. I was successful on my Linux machine, but I got the following errors on both Solaris platforms. Sun C 5.13: ===

Re: [OMPI users] mpirun hanging after MPI_Abort

2016-02-19 Thread Ralph Castain
The best options for debugging something like this are: -mca odls_base_verbose 5 -mca errmgr_base_verbose 5 It’ll generate a fair amount of output, so try to do it with a small job if you can. You’ll need a build configured with --enable-debug to get the output. > On Feb 18, 2016, at 8:29 PM, Ben M