On Wednesday 27 of May 2009 7:47:06 pm Damien Hocking wrote:
> I've seen this behaviour with MUMPS on shared-memory machines as well
> using MPI. I use the iterative refinement capability to sharpen the
> last few digits of the solution (2 or 3 iterations is usually enough).
> If you're not using that, give it a try, it will probably reduce the
> noise you're getting in your results. The quality of the answer from a
> direct solve is highly dependent on the matrix scaling and pivot order
> and it's easy to get differences in the last few digits. MUMPS itself
> is also asynchronous, and might not be completely deterministic in how
> it solves if MPI processes can run in a different order.
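For reference, the number of iterative refinement steps in MUMPS is
controlled through the parameter ICNTL(10). Below is a rough, untested C
sketch of how that is typically set, modeled on the small c_example.c that
ships with MUMPS; the 2x2 system and all of the values in it are
placeholders, not the problem discussed in this thread.

#include <stdio.h>
#include "mpi.h"
#include "dmumps_c.h"

#define JOB_INIT  -1
#define JOB_END   -2
#define USE_COMM_WORLD -987654
#define ICNTL(I) icntl[(I)-1]   /* so indices match the MUMPS documentation */

int main(int argc, char **argv)
{
  DMUMPS_STRUC_C id;
  int n = 2, nz = 2;
  int irn[] = {1, 2}, jcn[] = {1, 2};   /* 1-based coordinate format */
  double a[]   = {1.0, 2.0};            /* toy 2x2 diagonal matrix */
  double rhs[] = {1.0, 4.0};            /* right-hand side, overwritten by the solution */
  int myid;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);

  /* Initialise a MUMPS instance on MPI_COMM_WORLD. */
  id.job = JOB_INIT; id.par = 1; id.sym = 0;
  id.comm_fortran = USE_COMM_WORLD;
  dmumps_c(&id);

  /* Centralised input: the host (rank 0) provides the matrix and rhs. */
  if (myid == 0) {
    id.n = n; id.nz = nz;
    id.irn = irn; id.jcn = jcn;
    id.a = a; id.rhs = rhs;
  }

  id.ICNTL(10) = 5;   /* allow up to 5 steps of iterative refinement */

  id.job = 6;         /* analysis + factorisation + solve in one call */
  dmumps_c(&id);

  if (myid == 0)
    printf("solution: %8.2f %8.2f\n", rhs[0], rhs[1]);

  id.job = JOB_END;   /* release the instance */
  dmumps_c(&id);
  MPI_Finalize();
  return 0;
}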
I set the maximum number of refinement steps to 5. It did change the
solution, but it is still not the same as the one I get when I run with
2 CPUs.

> Damien
>
> George Bosilca wrote:
> > This is a problem of numerical stability, and there is no solution for
> > such a problem in MPI. Usually, preconditioning the input matrix
> > improves the numerical stability.
> >
> > If you read the MPI standard, there is a __short__ section about what
> > guarantees the MPI collective communications provide. There is only
> > one: if you run the same collective twice, on the same set of nodes
> > with the same input data, you will get the same output. In fact the
> > main problem is that MPI considers all default operations (MPI_OP) as
> > being commutative and associative, which is usually the case in the
> > real world but not when floating-point rounding is around. When you
> > increase the number of nodes, the data will be split into smaller
> > pieces, which means more operations will have to be done in order to
> > achieve the reduction, i.e. more rounding errors might occur, and so on.
> >
> > Thanks,
> > george.
> >
> > On May 27, 2009, at 11:16, vasilis wrote:
> >>> Rank 0 accumulates all the res_cpu values into a single array, res. It
> >>> starts with its own res_cpu and then adds all other processes. When
> >>> np=2, that means the order is prescribed. When np>2, the order is no
> >>> longer prescribed and some floating-point rounding variations can
> >>> start to occur.
> >>
> >> Yes, you are right. Now, the question is why these floating-point
> >> rounding variations would occur for np>2. It cannot be due to an
> >> unprescribed order!!
> >>
> >>> If you want results to be more deterministic, you need to fix the
> >>> order in which res is aggregated. E.g., instead of using
> >>> MPI_ANY_SOURCE, loop over the peer processes in a specific order.
> >>>
> >>> P.S. It seems to me that you could use MPI collective operations to
> >>> implement what you're doing. E.g., something like:
> >>
> >> I could use these operations for the res variable (will it make the
> >> summation any faster?). But I cannot use them for the other 3 variables.
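As a follow-up to the aggregation discussion quoted above, here is a rough
sketch of receiving from the peers in a fixed rank order instead of
MPI_ANY_SOURCE, so that rank 0 always adds the partial results in the same
sequence. The names res, res_cpu and n_res are stand-ins for the variables
mentioned in the thread, not the actual code.

#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int rank, np;
  const int n_res = 1000;                  /* assumed array length */

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &np);

  double *res_cpu = calloc(n_res, sizeof(double));  /* this rank's partial sums */
  double *res     = calloc(n_res, sizeof(double));  /* global sums on rank 0 */

  /* ... each rank fills res_cpu here ... */

  if (rank == 0) {
    for (int j = 0; j < n_res; j++)
      res[j] = res_cpu[j];                 /* start with rank 0's own contribution */
    double *buf = malloc(n_res * sizeof(double));
    for (int src = 1; src < np; src++) {   /* fixed order: 1, 2, ..., np-1 */
      MPI_Recv(buf, n_res, MPI_DOUBLE, src, 0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      for (int j = 0; j < n_res; j++)
        res[j] += buf[j];
    }
    free(buf);
  } else {
    MPI_Send(res_cpu, n_res, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
  }

  free(res_cpu);
  free(res);
  MPI_Finalize();
  return 0;
}

MPI_Reduce(res_cpu, res, n_res, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD) would
be shorter and probably faster, but the standard only promises the same
result when the same collective is repeated on the same processes with the
same data; it does not fix the combining order. The explicit loop above pins
that order down for a given np, which removes the run-to-run noise, while
results for different np will still differ in the last digits because the
partial sums themselves change.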