Re: [OMPI users] Open MPI exited on signal 11 (Segmentation fault). Trying to run a script that uses Open MPI

2013-07-05 Thread Ralph Castain
Well, it's telling you that your program segfaulted - so I'd start with that, perhaps looking for any core it might have dropped. On Jul 4, 2013, at 8:36 PM, Rick White wrote: > Hello, > > I have this error: > mpiexec noticed that process rank 1 with PID 16087 on node server exited on > sign

Re: [OMPI users] Open MPI exited on signal 11 (Segmentation fault). Trying to run a script that uses Open MPI

2013-07-05 Thread Ed Blosch
Compile with -traceback and -check all if using Intel. Otherwise find the right compiler options to check array bounds accesses and to dump a stack trace. Then compile in debug mode and run that way. Assuming it fails, you will probably get good info on the source of the problem. If it doesn't fail th

Re: [OMPI users] example program "ring" hangs when running across multiple hardware nodes (SOLVED)

2013-07-05 Thread Jed O. Kaplan
Dear Gus, Thanks for your help - your clue solved my problem! The ultimate solution was to limit MPI communications to the local, unrouted subnet. I made this the default behavior for all users of my cluster by adding the following line to the bottom of my $prefix/etc/openmpi-mca-params.conf file
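As a rough illustration of what "limit to the local, unrouted subnet" means in terms of addresses, here is a minimal Python sketch using the standard `ipaddress` module. The addresses, the subnet, and the helper name `addresses_on_subnet` are all hypothetical; this is not the configuration line from the message (which is cut off above) and it does not touch Open MPI itself.

```python
import ipaddress


def addresses_on_subnet(addresses, subnet):
    """Return the addresses that fall inside the given CIDR subnet."""
    network = ipaddress.ip_network(subnet)
    return [a for a in addresses if ipaddress.ip_address(a) in network]


# Hypothetical addresses of one node's interfaces (illustration only).
node_addresses = ["192.168.1.12", "10.0.0.12", "203.0.113.7"]

# Keep only the interface that sits on the assumed local subnet.
print(addresses_on_subnet(node_addresses, "192.168.1.0/24"))  # → ['192.168.1.12']
```

A check like this can help confirm which interface on each node actually belongs to the subnet you intend to restrict traffic to before changing any cluster-wide defaults.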

[OMPI users] using the xrc queues

2013-07-05 Thread Ben
I'm part of a team that maintains a global climate model running under MPI. Recently we have been trying it out with different MPI stacks at high resolution/processor counts. At one point in the code there are a large number of mpi_isends/mpi_recv (tens to hundreds of thousands) when data distrib

[OMPI users] How to select specific out of multiple interfaces for communication and support for heterogeneous fabrics

2013-07-05 Thread Michael Thomadakis
Hello OpenMPI We are seriously considering deploying OpenMPI 1.6.5 for production (and 1.7.2 for testing) on HPC clusters which consist of nodes with *different types of networking interfaces*. 1) Interface selection We are using OpenMPI 1.6.5 and were wondering how one would go about selectin

Re: [OMPI users] How to select specific out of multiple interfaces for communication and support for heterogeneous fabrics

2013-07-05 Thread Ralph Castain
I can't speak for MVAPICH - you probably need to ask them about this scenario. OMPI will automatically select whichever available transport can reach the intended process. This requires that each communicating pair of processes has access to at least one common transport. So if a process t
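The "at least one common transport per communicating pair" requirement can be checked with a small toy model. The Python sketch below is only an illustration under stated assumptions: the node names, the fabric labels, the `PREFERENCE` ordering, and the function `common_transports` are hypothetical, and this is not how Open MPI itself selects transports.

```python
from itertools import combinations

# Hypothetical preference order, fastest first (an assumption for illustration).
PREFERENCE = ["ib_fdr", "10gige", "1gige"]


def common_transports(node_fabrics):
    """Map each node pair to the fabrics both nodes share, fastest first."""
    result = {}
    for a, b in combinations(sorted(node_fabrics), 2):
        shared = node_fabrics[a] & node_fabrics[b]
        result[(a, b)] = [t for t in PREFERENCE if t in shared]
    return result


# Hypothetical cluster: every node has 1GigE, some have a faster fabric too.
nodes = {
    "node1": {"1gige", "ib_fdr"},
    "node2": {"1gige", "ib_fdr"},
    "node3": {"1gige", "10gige"},
}

for pair, shared in common_transports(nodes).items():
    # An empty list would mean the pair has no common transport at all.
    print(pair, shared[0] if shared else "unreachable")
```

In this model a pair with an empty list could not communicate, which is the situation the reply warns about.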

Re: [OMPI users] How to select specific out of multiple interfaces for communication and support for heterogeneous fabrics

2013-07-05 Thread Michael Thomadakis
Sorry about the mvapich2 reference :) All nodes are attached over a common 1GigE network. We wish, of course, that if a node-pair is connected via a higher-speed fabric *as well* (IB FDR or 10GigE), then this would be leveraged instead of the common 1GigE. One question: suppose that we use nodes ha

Re: [OMPI users] How to select specific out of multiple interfaces for communication and support for heterogeneous fabrics

2013-07-05 Thread Ralph Castain
As long as the IB interfaces can communicate with each other, you should be fine. On Jul 5, 2013, at 3:26 PM, Michael Thomadakis wrote: > Sorry about the mvapich2 reference :) > > All nodes are attached over a common 1GigE network. We wish, of course, that if > a node-pair is connected via a higher-s

[OMPI users] FW: checkpoint-restart of version 1.6.5

2013-07-05 Thread basma a.azeem
From: basmaabdelaz...@hotmail.com To: us...@open-mpi.org Subject: checkpoint-restart of version 1.6.5 Date: Fri, 5 Jul 2013 02:40:36 +0200 Does Open MPI 1.6.5 support checkpoint/restart (self or BLCR)? I did not find ompi-checkpoint or ompi-restart in the documentation list of version

Re: [OMPI users] checkpoint-restart of version 1.6.5

2013-07-05 Thread Ralph Castain
If you check the installed bin directory, you will find they are still there. Whether they work or not is something I wouldn't know - I suggest trying them to see. On Jul 5, 2013, at 6:05 PM, basma a.azeem wrote: > > > From: basmaabdelaz...@hotmail.com > To: us...@open-mpi.org > Subject: che

Re: [OMPI users] How to select specific out of multiple interfaces for communication and support for heterogeneous fabrics

2013-07-05 Thread Michael Thomadakis
Great ... thanks. We will try it out as soon as the common backbone IB is in place. Cheers, Michael On Fri, Jul 5, 2013 at 6:10 PM, Ralph Castain wrote: > As long as the IB interfaces can communicate with each other, you should be > fine. > > On Jul 5, 2013, at 3:26 PM, Michael Thomadakis > w