Hi Jeff, Konstantinos,

I think you might want MPI.C_DOUBLE_COMPLEX for your datatype, since 
np.complex128 is a double-precision complex type. That said, either the 
datatype argument is being ignored in favour of the datatype of the object 
you're sending, or mpi4py is handling the conversion somewhere in the backend. 
You could also just drop the datatype specification and let mpi4py select it 
for you, as you already do on the receiver side.
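
For what it's worth, here's a minimal sketch of both spellings (explicit 
buffer-plus-datatype versus letting mpi4py infer it), using one of the 1-by-1 
matrices from the thread; the variable name and tag are just placeholders. 
Note that the uppercase Send/Recv methods transmit the raw bytes of the NumPy 
buffer, whereas the lowercase send/recv methods pickle the object:

    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank = comm.Get_rank()

    if rank == 0:
        C = np.array([[28534314.10478439 + 28534314.10478436j]], dtype=np.complex128)
        # explicit datatype: np.complex128 corresponds to MPI.C_DOUBLE_COMPLEX
        comm.Send([C, MPI.C_DOUBLE_COMPLEX], dest=1, tag=11)
        # ...or just let mpi4py work the datatype out from the array's dtype:
        # comm.Send(C, dest=1, tag=11)
    elif rank == 1:
        C = np.empty((1, 1), dtype=np.complex128)
        comm.Recv(C, source=0, tag=11)  # datatype inferred here as well
        print(C)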

Modifying Jeff’s script to print out the product on the sender side as well, I 
see this:

Sender computed (first):
[[-7.97922801e+16+28534416.j]]
Receiver computed (first):
[[-7.97922801e+16+28534416.j]]
Sender computed (second):
[[-7.97922802e+16+48.j]]
Receiver computed (second):
[[-7.97922802e+16+48.j]]

Even the real part of the result is slightly different between the two 
approaches (as is the case for your results). So the values are probably being 
transmitted correctly; it's just that the values being sent are different in 
the first place. Adding np.set_printoptions(precision=20) to the program shows 
this:

Sender sent (first):
[[28534314.10478439+28534314.10478436j]]
[[-1.3981811475968072e+09+1.3981811485968091e+09j]]
Sender sent (second):
[[28534314.10478439+28534314.10478436j]]
[[-1.39818115e+09+1.39818115e+09j]]

If the second value is what you expect from your construction algorithm, then I 
suspect you're just seeing ordinary floating-point precision loss in one of 
the functions you're calling there. Otherwise, if you made the second input by 
copying the printed output from the first, you simply didn't copy enough 
decimal places :-) .
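
For reference, here's a quick standalone check of that last point, using 
nothing but the two versions of B printed above (no MPI involved):

    import numpy as np
    np.set_printoptions(precision=20)

    a = np.array([[28534314.10478439 + 28534314.10478436j]])
    b_full = np.array([[-1.3981811475968072e+09 + 1.3981811485968091e+09j]])
    b_short = np.array([[-1.39818115e+09 + 1.39818115e+09j]])  # fewer decimal places

    # The imaginary part of the product comes from near-total cancellation of
    # two cross terms of magnitude ~4e16, so it is extremely sensitive to the
    # last few digits of b; the two products differ visibly in their imaginary
    # parts.
    print(np.dot(a, b_full))
    print(np.dot(a, b_short))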

Cheers,
Ben


> On 23 May 2018, at 8:38 am, Konstantinos Konstantinidis 
> <kostas1...@gmail.com> wrote:
> 
> Thanks Jeff.
> 
> I ran your code and saw your point. Based on that, it seems that my 
> comparison by just printing the values was misleading.
> 
> I have two questions for you:
> 
> 1. Can you please describe your setup, i.e. Python version, NumPy version, 
> mpi4py version and Open MPI version? I'm asking since I am thinking of doing 
> a fresh build and trying Python 3. What do you think?
> 
> 2. When I try the following code (which manually computes the imaginary part 
> of that same complex number) at any receiver:
> 
> C_imag = np.dot(-28534314.10478436, 1.39818115e+09) + np.dot(28534314.10478439, 1.39818115e+09)
> print(C_imag)
> 
> I see that the answer is 48, which is correct. Do you think that this fact 
> points to mpi4py as the source of the precision loss, rather than numpy?
> 
> Honestly, I don't understand how such serious bugs can remain unresolved.
> 
> On Tue, May 22, 2018 at 5:05 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> There are two issues:
> 
> 1. You should be using MPI.C_COMPLEX, not MPI.COMPLEX.  MPI.COMPLEX is a 
> Fortran datatype; MPI.C_COMPLEX is the C datatype (which is what NumPy is 
> using behind the scenes).
> 
> 2. Somehow the received B values are different between the two.
> 
> I derived this program from your two programs to show the difference:
> 
>     https://gist.github.com/jsquyres/2ed86736e475e9e9ccd08b66378ef968
> 
> I don't know offhand how mpi4py sends floating point values -- but I'm 
> guessing that either mpi4py or numpy is pickling the floating point values 
> (vs. sending the exact bit pattern of the floating point value), and some 
> precision is being lost either in the pickling or the de-pickling.  That's a 
> guess, though.
> 
> 
> 
> > On May 22, 2018, at 2:51 PM, Konstantinos Konstantinidis <kostas1...@gmail.com> wrote:
> > 
> > Assume an Python MPI program where a master node sends a pair of complex 
> > matrices to each worker node and the worker node is supposed to compute 
> > their product (conventional matrix product). The input matrices are 
> > constructed at the master node according to some algorithm which there is 
> > no need to explain. Now imagine for simplicity that we have only 2 MPI 
> > processes, one master and one worker. I have created two versions of this 
> > program for this case. The first one constructs two complex numbers (1-by-1 
> > matrices for simplicity) and sends them to the worker to compute the 
> > product. This program is like a skeleton for what I am trying to do with 
> > multiple workers. In the second program, I have omitted the algorithm and 
> > have just hard-coded these two complex numbers into the code. The programs 
> > are supposed to give the same product shown here:
> > 
> > a = 28534314.10478439+28534314.10478436j
> > 
> > b = -1.39818115e+09+1.39818115e+09j
> > 
> > a*b = -7.97922802e+16+48j
> > 
> > This has been checked in Matlab. Instead, the first program does not work 
> > and the worker gives a*b = -7.97922801e+16+28534416.j while the second 
> > program works correctly. Please note that the data is transmitted correctly 
> > from the master to the worker and the data structures are the same in both 
> > cases (see the print() functions). 
> > 
> > The first (wrong) program is program1.py and the second (correct) is 
> > program2.py
> > 
> > I am using mpi4py 3.0.0 along with Python 2.7.14 and Open MPI 2.1.2. I have 
> > been struggling with this problem for a whole day and still cannot figure 
> > out what's going on. I have tried numerous initializations like np.zeros(), 
> > np.zeros_like() and np.empty_like(), both np.array and np.matrix, and the 
> > functions np.dot() and np.matmul() as well as the operator *. 
> > 
> > Finally, I think that the problem is always with the imaginary part of the 
> > product based on other examples I tried. Any suggestions?
> > <program1.py><program2.py>
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
