Hello
I'm still observing abnormal behavior of 'mpirun' in the presence of
failures. I performed some tests using 32 physical machines. I ran a
NAS benchmark using just one MPI process per machine.
I injected faults by shutting down the machines in two different ways:
1) logging into the machine
Hello,
I was studying how Open MPI reacts to failures. I have a virtual
infrastructure where failures can be emulated by turning off a given VM.
Depending on how the VM is turned off, 'mpirun' will be notified,
either because it receives a signal or because some timeout is reached.
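The two notification paths can be mimicked with plain subprocesses (a hedged sketch, not Open MPI internals; 'sleep' merely stands in for a remote MPI process):

```python
import signal
import subprocess

# Path 1: the process dies and its parent is notified promptly
# (analogous to mpirun seeing a signal / a dropped connection).
child = subprocess.Popen(["sleep", "60"])
child.send_signal(signal.SIGKILL)
status = child.wait()                 # returns right away
print("notified, status:", status)    # negative => killed by a signal

# Path 2: the machine just vanishes and no notification ever arrives,
# so the launcher only notices when its own timeout expires.
child = subprocess.Popen(["sleep", "60"])
try:
    child.wait(timeout=2)             # stand-in for mpirun's internal timeout
except subprocess.TimeoutExpired:
    print("no signal received; detected failure via timeout")
child.kill()
child.wait()
```

The second path is the expensive one: nothing happens until the timeout elapses.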
In bo
xtra time to be fixed, and if your application
is communication-intensive, these delays propagate and you can
end up with a huge performance hit.
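That propagation effect can be illustrated with a toy bulk-synchronous model (purely illustrative numbers; the assumption is that every iteration ends in a barrier, so all ranks wait for the slowest one):

```python
n_procs, n_iters = 16, 100
base = 1.0      # nominal per-iteration time (seconds)
penalty = 0.5   # extra delay on one degraded node

total = 0.0
for _ in range(n_iters):
    times = [base] * n_procs
    times[0] = base + penalty   # one slow rank per iteration
    total += max(times)         # barrier: everyone waits for the slowest

# 150.0 seconds instead of 100.0: a single slow node slows all 16 ranks by 50%
print(total)
```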
Cheers,
Gilles
On Tuesday, July 28, 2015, Cristian RUIZ <cristian.r...@inria.fr> wrote:
Hello,
I'm measuring the overhead of using Linux containers for HPC
applications. To do so, I was comparing the execution time of the NAS
parallel benchmarks on two infrastructures:
1) real: 16 real machines
2) container: 16 containers distributed over 16 real machines
Each machine used is equippe
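The real-vs-container comparison above reduces to a relative-overhead number; a minimal sketch with hypothetical timings (the real values would come from the benchmark output on each infrastructure):

```python
# Hypothetical wall-clock times (seconds) for one NAS kernel.
t_real = 100.0        # 16 physical machines
t_container = 104.5   # 16 containers on 16 physical machines

overhead_pct = (t_container - t_real) / t_real * 100.0
print(f"container overhead: {overhead_pct:.1f}%")
```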