Re: [OMPI users] MPI_AllReduce vs MPI_IAllReduce

2015-11-30 Thread Felipe .
Thanks for the reply, Ralph. Now I think it is clearer to me why it could be so much slower. The reason would be that the blocking reduction algorithm is implemented very differently from the non-blocking one. Since there are lots of ways to implement it, are there options to tune the non-blo

Re: [OMPI users] MPI_AllReduce vs MPI_IAllReduce

2015-11-27 Thread Ralph Castain
One thing you might want to keep in mind is that “non-blocking” doesn’t mean “asynchronous progress”. The API may not block, but the communications only progress whenever you actually call down into the library. So if you are calling a non-blocking collective, and then make additional calls int
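The point about progress can be illustrated with a toy model. The sketch below is not Open MPI code and makes no claim about its internals: `toy_request`, `toy_start`, `toy_test`, `toy_wait` and `steps_left_at_wait` are hypothetical names invented for this illustration. It assumes an operation that needs a fixed number of internal steps, where a step runs only while the caller is inside one of the library calls and never in the background.

```c
/* Toy model (hypothetical, not Open MPI internals): a non-blocking
   operation that needs `steps_left` internal steps to complete.
   A step runs only inside toy_test() or toy_wait(). */
typedef struct {
    int steps_left;
} toy_request;

/* Start the operation. Returns immediately and performs no steps. */
static void toy_start(toy_request *req, int steps) {
    req->steps_left = steps;
}

/* Poll the operation: advance it by at most one step.
   Returns 1 once the operation is complete, 0 otherwise. */
static int toy_test(toy_request *req) {
    if (req->steps_left > 0)
        req->steps_left--;
    return req->steps_left == 0;
}

/* Block until the operation completes.
   Returns how many steps had to be executed inside this call. */
static int toy_wait(toy_request *req) {
    int done_here = 0;
    while (req->steps_left > 0) {
        req->steps_left--;
        done_here++;
    }
    return done_here;
}

/* Start an operation of `steps` steps, run `chunks` pieces of local
   work (polling after each piece when `poll` is non-zero), then wait.
   Returns the number of steps that were still left for the wait. */
static int steps_left_at_wait(int steps, int chunks, int poll) {
    toy_request req;
    toy_start(&req, steps);
    for (int i = 0; i < chunks; i++) {
        /* ... one chunk of local computation would go here ... */
        if (poll)
            toy_test(&req);
    }
    return toy_wait(&req);
}
```

In this model, `steps_left_at_wait(8, 8, 0)` leaves all 8 steps for the final wait, so nothing was overlapped with the computation, while `steps_left_at_wait(8, 8, 1)` leaves none. In real MPI code the analogous pattern would be to call `MPI_Test` on the request returned by `MPI_Iallreduce` between chunks of local work. Whether that helps in practice depends on the implementation and on how the work is split.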

Re: [OMPI users] MPI_AllReduce vs MPI_IAllReduce

2015-11-27 Thread Felipe .
> Try to do a variable amount of work for every process; I see non-blocking
> as a way to speed up communication if the processes arrive individually at the call.
> Please always have this at the back of your mind when doing this.

I tried to simplify the problem in the explanation. The "local_computation" is

Re: [OMPI users] MPI_AllReduce vs MPI_IAllReduce

2015-11-27 Thread Nick Papior
Try to do a variable amount of work for every process; I see non-blocking as a way to speed up communication if the processes arrive individually at the call. Please always have this at the back of your mind when doing this. Surely non-blocking has overhead, and if the communication time is low, so will t