[OMPI users] Failure handling

2015-11-09 Thread Cristian RUIZ
Hello, I'm still observing abnormal behavior of 'mpirun' in the presence of failures. I performed some tests using 32 physical machines. I ran a NAS benchmark using just one MPI process per machine. I injected faults by shutting down the machines in two different ways: 1) logging into the machine

Re: [OMPI users] Failure detection

2015-11-07 Thread Cristian Camilo Ruiz Sanabria
n’t have started the MPI job. > > So something is clearly confused. > > > > On Nov 7, 2015, at 6:41 AM, Cristian RUIZ wrote: > > > > Hello, > > > > I was studying how OpenMPI reacts to failures. I have a virtual > > infrastructure where failures ca

[OMPI users] Failure detection

2015-11-07 Thread Cristian RUIZ
Hello, I was studying how OpenMPI reacts to failures. I have a virtual infrastructure where failures can be emulated by turning off a given VM. Depending on the way the VM is turned off, 'mpirun' will be notified, either because it receives a signal or because some timeout is reached. In bo

Re: [OMPI users] strange behavior of MPI_wait() method

2015-07-28 Thread Cristian RUIZ
is normal because I observe the same thing when I use real machines, and the performance in this case is much better. [1] https://linuxcontainers.org/ On 07/28/2015 02:31 PM, Gilles Gouaillardet wrote: Cristian, If the message takes some extra time to land into the receiver, then MPI_Wait

[OMPI users] strange behavior of MPI_wait() method

2015-07-28 Thread Cristian RUIZ
Hello, I'm measuring the overhead of using Linux containers for HPC applications. To do so, I was comparing the execution time of the NAS parallel benchmarks on two infrastructures: 1) real: 16 real machines 2) container: 16 containers distributed over 16 real machines. Each machine used is equippe

[OMPI users] MPI_Init() time

2015-04-15 Thread cristian
Hello, I noticed when profiling an application that the MPI_Init() function takes a considerable amount of time. There is a big difference between running 32 processes over 32 machines and 32 processes over 8 machines (each machine has 8 cores). These are the results of the profil

Re: [OMPI users] Cygwin compilation problems for openmpi-1.8

2014-04-15 Thread Cristian Butincu
The first process to do so was:   Process name: [[33371,1],1]   Exit code:    65 -- On Sunday, April 13, 2014 7:33 PM, Marco Atzeri wrote: On 12/04/2014 18:42, Cristian Butincu wrote: > Hello. > > The latest precompiled

[OMPI users] Cygwin compilation problems for openmpi-1.8

2014-04-12 Thread Cristian Butincu
Hello. The latest precompiled version to date of openmpi for cygwin is 1.7.4-1. Because I got some runtime errors when trying to run simple MPI programs, I have decided to upgrade to openmpi-1.8. When trying to compile openmpi-1.8 under cygwin I get the following error during "make all" and th

Re: [OMPI users] mpirun exit status

2009-03-20 Thread Cristian KLEIN
ach call to ORTE_UPDATE_EXIT_STATUS, whether the low 8 bits are indeed non-zero, wouldn't it be wiser to have ORTE_UPDATE_EXIT_STATUS do the check? > > On Mar 19, 2009, at 10:58 AM, Cristian KLEIN wrote: > >> Hello everybody, >> >> I've been using OpenMPI 1.3's mpirun

[OMPI users] mpirun exit status

2009-03-19 Thread Cristian KLEIN
Hello everybody, I've been using OpenMPI 1.3's mpirun in Makefiles and observed that the exit status is not always the one I expect. For example, using an incorrect machinefile makes mpirun return 0, whereas a non-zero value would be expected: --- cut here --- masternode:~/grid/myTests/hellompi$