Re: [OMPI users] MPI_Get is slow with structs containing padding

2023-04-04 Thread Antoine Motte via users
Hi Nathan, Joseph, Thank you for your quick answers. I also noticed bad performance of MPI_Get when there are displacements in the datatype, not necessarily padding. So I'll keep in mind to declare the padding in my MPI_Datatype to allow MPI to copy it, and make the whole set of data contiguou
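A minimal sketch of the "declare the padding" idea, written in plain C++ without MPI so it can be compiled anywhere. It checks whether a list of declared fields covers every byte of an element; `Block`, `has_gaps`, and the 2-byte/4-byte layout are hypothetical names and numbers chosen for illustration, not taken from the original code. In a real program the field list would presumably correspond to what is passed to something like `MPI_Type_create_struct`, which is an assumption here.

```cpp
#include <cstddef>
#include <vector>

// One declared field of an element: byte offset and byte length.
struct Block {
    std::size_t offset;
    std::size_t length;
};

// Returns true if the declared fields leave any byte of [0, extent) uncovered.
// Assumes `fields` is sorted by offset and the fields do not overlap.
bool has_gaps(const std::vector<Block>& fields, std::size_t extent) {
    std::size_t cursor = 0;  // first byte not yet covered
    for (const Block& f : fields) {
        if (f.offset != cursor) return true;  // hole before this field
        cursor = f.offset + f.length;
    }
    return cursor != extent;  // trailing bytes not covered
}
```

With the hypothetical layout of a 2-byte field at offset 0 and a 4-byte field at offset 4 in an 8-byte element, `has_gaps` reports a gap; adding a 2-byte padding field at offset 2 removes it, which is the situation where the whole element can be treated as one contiguous run.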

Re: [OMPI users] MPI_Get is slow with structs containing padding

2023-03-30 Thread Nathan Hjelm via users
That is exactly the issue. It is part of the reason I have argued against MPI_SHORT_INT usage in RMA: even though it is padded due to type alignment, we are still not allowed to operate on the bits between the short and the int. We can correct that one in the standard by adding the same languag

Re: [OMPI users] MPI_Get is slow with structs containing padding

2023-03-30 Thread Joseph Schuchart via users
Hi Antoine, That's an interesting result. I believe the problem with datatypes with gaps is that MPI is not allowed to touch the gaps. My guess is that for the RMA version of the benchmark the implementation either has to revert back to an active message packing the data at the target and send

Re: [OMPI users] MPI_Get is slow with structs containing padding

2023-03-30 Thread Nathan Hjelm via users
Yes. This is absolutely normal. When you give MPI non-contiguous data it has to break it down into one operation per contiguous region. If you have a non-RDMA network this can lead to very poor performance. With RDMA networks it will also be much slower than a contiguous get but lower overhead
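To make "one operation per contiguous region" concrete, here is a small sketch in plain C++ (no MPI calls) that counts how many contiguous byte regions a run of elements breaks into. `Block`, `contiguous_regions`, and the example layout are hypothetical and only illustrate the counting; they say nothing about how Open MPI actually implements the transfer.

```cpp
#include <cstddef>
#include <vector>

// One contiguous run of bytes inside an element (offset and length in bytes).
struct Block {
    std::size_t offset;
    std::size_t length;
};

// Count the contiguous byte regions covered by `count` consecutive elements.
// `fields` lists the bytes that may be touched in one element, sorted by
// offset; `extent` is the distance in bytes between consecutive elements.
std::size_t contiguous_regions(const std::vector<Block>& fields,
                               std::size_t extent, std::size_t count) {
    std::size_t regions = 0;
    std::size_t end_of_previous = 0;  // one past the last byte seen so far
    for (std::size_t e = 0; e < count; ++e) {
        for (const Block& f : fields) {
            if (f.length == 0) continue;  // empty fields cover nothing
            const std::size_t start = e * extent + f.offset;
            // A new region starts unless this block directly follows the last one.
            if (regions == 0 || start != end_of_previous) ++regions;
            end_of_previous = start + f.length;
        }
    }
    return regions;
}
```

For an assumed 8-byte element with a 2-byte field at offset 0 and a 4-byte field at offset 4, three elements split into four regions, and the count keeps growing with the number of elements. Describing each element as a single 8-byte block gives one region no matter how many elements are transferred.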

[OMPI users] MPI_Get is slow with structs containing padding

2023-03-30 Thread Antoine Motte via users
Hello everyone, I recently had to code an MPI application where I send std::vector contents in a distributed environment. In order to try different approaches I coded both 1-sided and 2-sided point-to-point communication schemes: the first one uses MPI_Window and MPI_Get, the second one uses