Diego, assuming the problem is in how you coded the solver: set up a problem whose exact solution you already know. That is, pick A (a very simple non-singular SPD matrix) and x_, with x_ != 0. Make x_ a linear function or a constant so the bad entries of x are easy to spot. I assume A has the boundary conditions built into it and you are not handling them outside of A; if not, you will have to check those too. Then compute b = A*x_. Given A and b, initialize x to zero and use your CG to find x. Display/plot x, or the error abs(x - x_), as the iterations proceed. Display them on a map where you can easily locate your cores and the distribution of A, x and b. You will soon see where your x is not being updated properly (e.g. data not being exchanged correctly between your cores). I assume you are also using a simple 1D or 2D partitioning of your data, so the issues should be easy to spot.
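For concreteness, here is a minimal serial sketch of that check (the 1D Laplacian, the linear x_exact, n = 50 and the tolerance are just example choices, not taken from your code; the idea is the structure, not the numbers):

! Manufactured-solution check: A is a simple non-singular SPD matrix,
! x_exact is linear, b = A*x_exact, and CG starts from x = 0.
program cg_check
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 50
  real(dp) :: A(n,n), x_exact(n), b(n), x(n), r(n), p(n), Ap(n)
  real(dp) :: alpha, rs_old, rs_new
  integer  :: i, it

  ! A: 1D Laplacian with the Dirichlet boundary conditions built in,
  ! so it is symmetric positive definite and non-singular.
  A = 0.0_dp
  do i = 1, n
     A(i,i) = 2.0_dp
     if (i > 1) A(i,i-1) = -1.0_dp
     if (i < n) A(i,i+1) = -1.0_dp
  end do

  ! Exact solution: a simple linear function, easy to eyeball.
  do i = 1, n
     x_exact(i) = real(i, dp)
  end do
  b = matmul(A, x_exact)

  ! Plain CG from x = 0; watch the error shrink (or not).
  x = 0.0_dp
  r = b                          ! r = b - A*x with x = 0
  p = r
  rs_old = dot_product(r, r)
  do it = 1, n
     Ap    = matmul(A, p)
     alpha = rs_old / dot_product(p, Ap)
     x     = x + alpha*p
     r     = r - alpha*Ap
     rs_new = dot_product(r, r)
     print '(a,i4,a,es12.4)', 'iter ', it, '  max |x - x_exact| = ', &
           maxval(abs(x - x_exact))
     if (sqrt(rs_new) < 1.0e-12_dp) exit
     p      = r + (rs_new/rs_old)*p
     rs_old = rs_new
  end do
end program cg_check

Once the serial version behaves, run the same test on 2 and 4 ranks and print each rank's number next to its slice of the error; a block of wrong values that lines up with a rank boundary points straight at the data exchange.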
Good luck.
Joshua

PS. Usually you have to build a communication library for your algebra that you trust (thoroughly tested). Then you build the data types of your algebra bit by bit: scalar, vector, matrix. Then the operators (addition, product), and finally your solver: CG, BiCGSTAB, GMRESR, ... (A small standalone sketch of why the reduction order alone changes the last bits follows the quoted thread below.)

------ Original Message ------
Received: 05:58 PM CDT, 10/28/2015
From: Diego Avesani <diego.aves...@gmail.com>
To: Open MPI Users <us...@open-mpi.org>
Subject: Re: [OMPI users] single CPU vs four CPU result differences, is it normal?

> dear Damien,
> I wrote the solver by myself. I have not understood your answer.
>
> Diego
>
> On 28 October 2015 at 23:09, Damien <dam...@khubla.com> wrote:
> >
> > Diego,
> >
> > There aren't many linear solvers that are bit-consistent, where the answer
> > is the same no matter how many cores or processes you use. Intel's version
> > of Pardiso is bit-consistent and I think MUMPS 5.0 might be, but that's
> > all. You should assume your answer will not be exactly the same as you
> > change the number of cores or processes, although you should reach the same
> > overall error tolerance in approximately the same number of iterations.
> >
> > Damien
> >
> > On 2015-10-28 3:51 PM, Diego Avesani wrote:
> >
> > dear Andreas, dear all,
> > The code is quite long. It is a conjugate gradient algorithm to solve a
> > complex system.
> >
> > I have noticed that when a do loop is small, let's say
> > do i=1,3
> >
> > enddo
> >
> > the results are identical. If the loop is big, let's say do i=1,20, the
> > results are different and the difference increases with the number of
> > iterations.
> >
> > What do you think?
> >
> > Diego
> >
> > On 28 October 2015 at 22:32, Andreas Schäfer <gent...@gmx.de> wrote:
> >
> >> On 22:03 Wed 28 Oct , Diego Avesani wrote:
> >> > When I use a single CPU I get one result; when I use 4 CPUs I get
> >> > another one. I do not think that there is a bug.
> >>
> >> Sounds like a bug to me, most likely in your code.
> >>
> >> > Do you think that these small differences are normal?
> >>
> >> It depends on what small means. Floating-point operations in a
> >> computer are generally not associative, so parallelization may indeed
> >> lead to different results.
> >>
> >> > Is there any way to get the same results? Is it some alignment problem?
> >>
> >> Impossible to say without knowing your code.
> >>
> >> Cheers
> >> -Andreas
> >>
> >> --
> >> ==========================================================
> >> Andreas Schäfer
> >> HPC and Grid Computing
> >> Department of Computer Science 3
> >> Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
> >> +49 9131 85-27910
> >> PGP/GPG key via keyserver
> >> http://www.libgeodecomp.org
> >> ==========================================================
> >>
> >> (\___/)
> >> (+'.'+)
> >> (")_(")
> >> This is Bunny. Copy and paste Bunny into your
> >> signature to help him gain world domination!
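As promised in the PS above: the run-to-run differences Damien and Andreas describe usually come from nothing more than the order of the additions changing with the partitioning. A tiny standalone sketch of the effect (the array length and the 4-way split are arbitrary choices; the chunked sum stands in for the per-rank partial sums that an MPI_Allreduce would combine):

! Why 1 core and 4 cores give slightly different numbers: floating-point
! addition is not associative, so a reduction built from per-rank partial
! sums can differ in the last bits from the single-core sum. CG amplifies
! those bits over the iterations because alpha and beta are ratios of such
! reductions.
program reduction_order
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 1000000, nranks = 4
  real(dp) :: v(n), partial(nranks), one_sweep, chunked
  integer  :: i, r, lo, hi, chunk

  call random_number(v)
  v = v - 0.5_dp                  ! mixed signs make the cancellation visible

  one_sweep = 0.0_dp              ! what a single core would compute
  do i = 1, n
     one_sweep = one_sweep + v(i)
  end do

  chunk = n / nranks
  do r = 1, nranks
     lo = (r - 1)*chunk + 1
     hi = r*chunk
     partial(r) = sum(v(lo:hi))   ! each "rank"'s local contribution
  end do
  chunked = sum(partial)          ! the combined, Allreduce-style result

  print *, 'single-sweep sum :', one_sweep
  print *, 'chunked sum      :', chunked
  print *, 'difference       :', one_sweep - chunked
end program reduction_order

If your 1-core and 4-core runs differ only at this level and the iteration counts stay roughly the same, that is the normal behaviour Damien describes; if whole blocks of x are wrong, it is the kind of exchange/partitioning bug the manufactured-solution test above will expose.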