Hi, I did my due diligence searching for this question, but
I apologise if this is a repeat.
From an architectural point of view, does it make sense to use MPI in the
following scenario (for the purposes of resilience as much as
parallelization):
Each process is a long-running process (run
ion time, you might be
> better served for [extremely] long-running applications by using a
> simple (but resilient) sockets-based communication layer and not using
> MPI. I say this mainly because of the fault tolerance issues involved
> and the natural hardware MTBF values that we see on today's