Thanks for the quick feedback. I opened an issue here:
https://github.com/open-mpi/ompi/issues/5383


Clyde Stanfield 
Software Engineer 
734-480-5100 office 
clyde.stanfi...@mdaus.com

-----Original Message-----
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of Nathan Hjelm via users
Sent: Friday, July 06, 2018 10:57 AM
To: Open MPI Users <users@lists.open-mpi.org>
Cc: Nathan Hjelm <hje...@me.com>
Subject: Re: [OMPI users] MPI_Ialltoallv

No, that's a bug. Please open an issue on GitHub and we will fix it shortly.

Thanks for reporting this issue.

-Nathan

> On Jul 6, 2018, at 8:08 AM, Stanfield, Clyde <clyde.stanfi...@radiantsolutions.com> wrote:
> 
> We are using MPI_Ialltoallv for an image processing algorithm. For this 
> we pass in a contiguous datatype (MPI_Type_contiguous over 
> MPI_C_FLOAT_COMPLEX) whose extent covers multiple rows of the image 
> (based on the number of nodes used for distribution). In addition, 
> sendcounts, sdispls, recvcounts, and rdispls all fit within signed ints. 
> Usually this works without any issues, but when we lower our number of 
> nodes we sometimes see failures.
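> 
> A rough sketch of the pattern (the helper function and names like 
> rows_per_block are illustrative, not our actual code):
> 
>     #include <mpi.h>
>     #include <complex>
>     #include <vector>
> 
>     void exchange(std::vector<std::complex<float>> &send,
>                   std::vector<std::complex<float>> &recv,
>                   const std::vector<int> &sendcounts,
>                   const std::vector<int> &sdispls,
>                   const std::vector<int> &recvcounts,
>                   const std::vector<int> &rdispls,
>                   int rows_per_block, int num_columns)
>     {
>         // One datatype element spans several image rows.
>         MPI_Datatype row_block;
>         MPI_Type_contiguous(rows_per_block * num_columns,
>                             MPI_C_FLOAT_COMPLEX, &row_block);
>         MPI_Type_commit(&row_block);
> 
>         // All counts and displacements are plain ints, as the API requires.
>         MPI_Request req;
>         MPI_Ialltoallv(send.data(), sendcounts.data(), sdispls.data(), row_block,
>                        recv.data(), recvcounts.data(), rdispls.data(), row_block,
>                        MPI_COMM_WORLD, &req);
>         MPI_Wait(&req, MPI_STATUS_IGNORE);
>         MPI_Type_free(&row_block);
>     }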
> 
> What we found is that even though we can fit everything into signed 
> ints, line 528 of nbc_internal.h ends up calling malloc() with an int 
> whose value is num_distributed_rows * num_columns * 
> sizeof(std::complex<float>), which for very large images wraps around 
> to a negative value. As a result we see “Error in malloc()” (line 530 
> of nbc_internal.h) throughout our output.
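> 
> The wrap itself is easy to reproduce in isolation (sizes are 
> illustrative):
> 
>     #include <complex>
>     #include <cstddef>
>     #include <cstdio>
> 
>     int main() {
>         int rows = 20000, cols = 20000;
>         // 20000 * 20000 * 8 = 3.2e9 > INT_MAX: signed overflow, which in
>         // practice wraps to a negative value that malloc() then rejects.
>         int bad = rows * cols * (int)sizeof(std::complex<float>);
>         // Doing the same math in size_t gives the correct 3200000000.
>         size_t good = (size_t)rows * cols * sizeof(std::complex<float>);
>         std::printf("int: %d  size_t: %zu\n", bad, good);
>         return 0;
>     }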
> 
> We can get around this issue by ensuring the total extent of our 
> contiguous type never exceeds 2GB. However, this was unexpected, as our 
> understanding was that as long as we can fit all the parts into signed 
> ints we should be able to transfer more than 2GB at a time. Is it 
> intended that MPI_Ialltoallv requires the underlying data to be less 
> than 2GB, or is this an error in how malloc is being called (it should 
> be called with a size_t instead of an int)?
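> 
> In practice our workaround amounts to a size_t-based pre-check before 
> committing the type (placeholder names again):
> 
>     #include <climits>
>     #include <complex>
>     #include <cstddef>
> 
>     // Do the byte math in size_t and verify it fits before handing
>     // anything to the int-based datatype/collective API.
>     bool fits_in_int(int rows_per_block, int num_columns, int max_count)
>     {
>         size_t block_bytes = (size_t)rows_per_block * num_columns
>                            * sizeof(std::complex<float>);
>         return block_bytes * (size_t)max_count <= (size_t)INT_MAX;
>     }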
> 
> Thanks,
> Clyde Stanfield
> 
> Clyde Stanfield
> Software Engineer
> 734-480-5100 office
> clyde.stanfi...@mdaus.com

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
