> This is a problem of numerical stability, and there is no solution for
> such a problem in MPI. Usually, preconditioning the input matrix
> improves the numerical stability.
It could be a numerical stability issue, but that would imply that I have an
ill-conditioned matrix. This is not my case.

> If you read the MPI standard, there is a __short__ section about what
> guarantees the MPI collective communications provide. There is only
> one: if you run the same collective twice, on the same set of nodes
> with the same input data, you will get the same output. In fact the
> main problem is that MPI considers all default operations (MPI_OP) as
> being commutative and associative, which is usually the case in the
> real world but not when floating point rounding is around. When you
> increase the number of nodes, the data will be spread in smaller
> pieces, which means more operations will have to be done in order to
> achieve the reduction, i.e. more rounding errors might occur and so on.

You would have a point if I saw these small differences in both matrices.
I am solving the system Ax=b with the MUMPS library. The construction of
the matrix A and the column vector b is distributed among np CPUs. The
matrix A is the same whether I use 2 CPUs or np CPUs, but the vector b
changes slightly when I use more than 2 CPUs. My data are not spread in
smaller pieces!!

I am using the FEM to solve the system of equations, and I use MPI to
partition the domain. Therefore, the data (i.e., the vector of unknowns)
is the same on all the CPUs, and each CPU constructs a portion of the
matrices A and b. Then, on the host CPU I add all these pieces into A
and b.

Thank you,
Vasilis

>
> Thanks,
>    george.
>
> On May 27, 2009, at 11:16 , vasilis wrote:
>
> >> Rank 0 accumulates all the res_cpu values into a single array, res.
> >> It starts with its own res_cpu and then adds all other processes.
> >> When np=2, that means the order is prescribed. When np>2, the order
> >> is no longer prescribed and some floating-point rounding variations
> >> can start to occur.
> >
> > Yes, you are right. Now the question is: why would these
> > floating-point rounding variations occur for np>2? It cannot be due
> > to the order not being prescribed!!
> >
> >> If you want results to be more deterministic, you need to fix the
> >> order in which res is aggregated. E.g., instead of using
> >> MPI_ANY_SOURCE, loop over the peer processes in a specific order.
> >>
> >> P.S. It seems to me that you could use MPI collective operations to
> >> implement what you're doing. E.g., something like:
> >
> > I could use these operations for the res variable (will it make the
> > summation any faster?). But I cannot use them for the other 3
> > variables.
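For what it's worth, the rounding effect described above is easy to reproduce
outside MPI. A minimal C illustration (a constructed example, not code from
this thread) showing that floating-point addition is not associative, so a
different summation order can legitimately change the last digits of a
reduction:

    #include <stdio.h>

    int main(void)
    {
        /* Constructed values: the two large terms cancel, so the order in
         * which the small term is added decides whether it survives the
         * rounding. */
        double a = 1.0e16, b = -1.0e16, c = 1.0;

        printf("(a + b) + c = %.17g\n", (a + b) + c);   /* prints 1 */
        printf("a + (b + c) = %.17g\n", a + (b + c));   /* prints 0 */
        return 0;
    }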
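And to make the suggestion quoted above (loop over the peer processes in a
specific order instead of MPI_ANY_SOURCE) concrete, here is a rough C sketch
of a fixed-order accumulation. Only a sketch under my own assumptions: the
names res and res_cpu come from the thread, but the length n, the message
tag, and the function name are invented, and the real code may well look
different.

    #include <stdlib.h>
    #include <mpi.h>

    /* Sketch only: rank 0 receives the partial results in a fixed rank
     * order instead of MPI_ANY_SOURCE, so the sum is always formed as
     * res_cpu(0) + res_cpu(1) + ... and no longer depends on message
     * arrival order. */
    void accumulate_res(double *res, double *res_cpu, int n,
                        int rank, int nprocs)
    {
        if (rank == 0) {
            for (int i = 0; i < n; i++)
                res[i] = res_cpu[i];                 /* rank 0's own part */

            double *buf = malloc((size_t)n * sizeof *buf);
            for (int src = 1; src < nprocs; src++) { /* fixed source order */
                MPI_Recv(buf, n, MPI_DOUBLE, src, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                for (int i = 0; i < n; i++)
                    res[i] += buf[i];
            }
            free(buf);
        } else {
            MPI_Send(res_cpu, n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
        }
    }

The collective alternative would be a single
MPI_Reduce(res_cpu, res, n, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD); but, as
the section of the standard quoted above says, that is only guaranteed to be
reproducible for a fixed set of processes, not across different values of np.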