George Bosilca wrote:
This is a problem of numerical stability, and there is no solution for such a problem in MPI. Usually, preconditioning the input matrix improves the numerical stability.
At the level of this particular e-mail thread, the issue seems to me to be different. Results are added together in some arbitrary order and there are variations on the order of 10^-10. This is not an issue of numerical stability, but just of bitwise floating-point reproducibility.
And, given that one could fix the order (by using explicit source processes instead of MPI_ANY_SOURCE), one could "fix" this particular problem in MPI.
Anyhow, I'm just picking nits here.