Gus Correa <g...@ldeo.columbia.edu> writes:

> Or run a serial version on the same set of machines,
> compiled in similar ways (compiler version, opt flags, etc)
> to the parallel versions, and compare results.
> If the results don't differ, then you can start blaming MPI.

That wouldn't show that there's actually any OpenMPI-specific problem, though -- the parallelism potentially introduces indeterminacy. [I don't mean to imply Gus thinks otherwise, or that anyone has enough information to guess what's actually happening.] General discussion of numerical issues and scientific computing war stories must be way off-topic here...
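
For what it's worth, here is a small sketch of the kind of indeterminacy meant above. It is plain Python, not MPI, and not a claim about what is happening in this particular case. Floating-point addition is not associative, so a sum that is split into per-rank partial sums and then combined can differ from the serial left-to-right sum. The helper names naive_sum and chunked_sum and the input values are made up for the illustration.

```python
import math


def naive_sum(values):
    """Plain left-to-right floating-point sum, as a serial loop would do it."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, nranks):
    """Mimic a reduction: each 'rank' sums one contiguous chunk,
    then the partial sums are combined in rank order."""
    size = math.ceil(len(values) / nranks)
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# Hypothetical input chosen so that rounding depends on grouping.
values = [1e16, 1.0, -1e16, 1.0]

serial = naive_sum(values)         # one process, left to right
two_ranks = chunked_sum(values, 2) # same data, split over two "ranks"
exact = math.fsum(values)          # correctly rounded reference sum

print(serial, two_ranks, exact)
```

The three numbers printed differ from one another even though the input is identical, and none of that involves a bug in any library. A real MPI reduction may additionally combine partial results in an order that varies with the process count or the algorithm chosen, which is why a serial-versus-parallel mismatch alone doesn't point at the MPI implementation.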