[OMPI users] Failure handling

2015-11-09 Thread Cristian RUIZ
Hello, I'm still observing abnormal behavior of 'mpirun' in the presence of failures. I performed some tests using 32 physical machines. I ran a NAS benchmark using just one MPI process per machine. I inject faults by shutting down the machines in two different ways: 1) logging into the machine

[OMPI users] Failure detection

2015-11-07 Thread Cristian RUIZ
Hello, I was studying how OpenMPI reacts to failures. I have a virtual infrastructure where failures can be emulated by turning off a given VM. Depending on how the VM is turned off, 'mpirun' is notified either because it receives a signal or because some timeout is reached. In bo

Re: [OMPI users] strange behavior of MPI_wait() method

2015-07-28 Thread Cristian RUIZ
xtra time to be fixed, and if your application is communication intensive, these delays get propagated and you can end up with a huge performance hit. Cheers, Gilles On Tuesday, July 28, 2015, Cristian RUIZ <cristian.r...@inria.fr> wrote: Hello, I'm measurin

[OMPI users] strange behavior of MPI_wait() method

2015-07-28 Thread Cristian RUIZ
Hello, I'm measuring the overhead of using Linux containers for HPC applications. To do so, I was comparing the execution time of the NAS parallel benchmarks on two infrastructures: 1) real: 16 real machines 2) container: 16 containers distributed over 16 real machines Each machine used is equippe