You may be looking for BigMPI (https://github.com/jeffhammond/BigMPI). The README and paper linked therein will tell you everything you need to know.
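For context on the large-count problem itself: in the MPI-3 C interface the count arguments are plain int, so a buffer with more than INT_MAX elements has to be described in pieces. Below is a minimal sketch of that splitting step in plain C++, with no MPI calls. The names Chunk and split_count are hypothetical and exist only for this illustration; they are not part of MPI or BigMPI, and this is not how BigMPI itself is implemented.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one piece of a large buffer.
struct Chunk {
    std::uint64_t offset;  // element offset of this piece within the full buffer
    int count;             // number of elements in this piece; always fits in an int
};

// Split `total` elements into pieces of at most `max_chunk` elements each.
// Each piece has a count that is safe to pass as an int count argument.
std::vector<Chunk> split_count(std::uint64_t total, int max_chunk = INT_MAX) {
    assert(max_chunk > 0);
    std::vector<Chunk> chunks;
    std::uint64_t offset = 0;
    while (offset < total) {
        const std::uint64_t remaining = total - offset;
        const int count = remaining < static_cast<std::uint64_t>(max_chunk)
                              ? static_cast<int>(remaining)
                              : max_chunk;
        chunks.push_back({offset, count});
        offset += static_cast<std::uint64_t>(count);
    }
    return chunks;
}
```

For example, split_count(5000000000ULL) yields three pieces, each with a count that fits in an int. Every piece could then be sent with its own collective call, with the offset applied to the buffer pointer.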
However, BigMPI solves a correctness problem, not a performance one. If you want faster collectives for multi-gigabyte buffers, you should roll your own pipelined implementation.

Jeff

On Tue, Nov 28, 2017 at 4:05 PM Konstantinos Konstantinidis <kostas1...@gmail.com> wrote:

> Hi, I have a communicator, say *comm*, and some shuffling of data takes
> place within its nodes.
>
> I have implemented the shuffling with broadcasts but now I am trying to
> experiment with MPI_Allgather() and MPI_Allgatherv().
>
> For demonstration purposes I am adding here a small part of the C++ code.
> You can assume that *no_keys* keys each of size *keysize *unsigned char are
> locally stored at some structure named *endata, *at each node.
>
> *unsigned keysize = 100;*
> *unsigned long long no_keys = 10;*
> *// unsigned long long bytes_send_count = no_keys*keysize; *
> *int bytes_send_count = (int)no_keys*keysize; *
>
> *unsigned int commSize = (unsigned)comm.Get_size();*
>
> *// unsigned long long* recv_counts = new unsigned long long[commSize];*
> *int* recv_counts = new int[commSize];*
> *// unsigned long long* displs = new unsigned long long[commSize];*
> *int* displs = new int[commSize]; *
> *//Exchange amount of data*
> *// comm.Allgather(&bytes_send_count, 1, MPI::UNSIGNED_LONG_LONG,
> recv_counts, 1, MPI::UNSIGNED_LONG_LONG); *
> *comm.Allgather(&bytes_send_count, 1, MPI::INT, recv_counts, 1,
> MPI::INT); *
>
> *unsigned long long total = 0; *
> *for(unsigned int i = 0; i < commSize; i++){ *
> * displs[i] = total; *
> * total += recv_counts[i]; *
> *}*
>
> *unsigned char* recv_buf = new unsigned char[total];*
> *//Exchange actual data*
> *comm.Allgatherv(&endata, no_keys*keysize, MPI::UNSIGNED_CHAR, recv_buf,
> recv_counts, displs, MPI::UNSIGNED_CHAR);*
>
> My problem is that the number of keys is actually big and cannot fit into
> an *int*, that's why it is defined as an *unsigned long long*. So, what I
> would like to do is basically the commented lines of code but I get the
> error that there is no match to this function if I use *unsigned long
> long*. Is this true, and if yes why is that? I don't understand why
> Bcast() supports unsigned long long while Allgatherv() does not.
>
> I would like to avoid setting up multiple Allgatherv() calls for this
> since the number of connections that are initiated based on my algorithm is
> quite big and I am afraid that this will create further delays. Actually,
> this is the reason I am trying to replace Bcast() and try other things.
>
> I am using Open MPI 2.1.2 and testing on a single computer with 7 MPI
> processes. The ompi_info is the attached file.
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/