Hi Gilles. Thank you for your reply. Here's some code:
// MASTER NODE
printf("[%s][RECV] src=%d tag=%d\n", processor_name, src, hashtag);
fflush(stdout);
MPI_Request req;
rs = MPI_Irecv(buf, count, MPI_DOUBLE, src, hashtag, comm, &req);
MPI_Wait(&req, status);
printf("[%s][RECV] src=%d tag=%d OK\n", processor_name, src, hashtag);
fflush(stdout);

// WORKER NODES
printf("[exec_cmd] Send double buff to %d, %d\n", dest, msg_tag);
fflush(stdout);
int bufsize = msg_size * sizeof(double) + MPI_BSEND_OVERHEAD;
double *buff = malloc(bufsize);
MPI_Buffer_attach(buff, bufsize);
MPI_Bsend(rec_msg, msg_size, MPI_DOUBLE, dest, msg_tag, comm);
MPI_Buffer_detach(buff, &bufsize);
printf("[exec_cmd] Send double buff to %d, %d OK\n", dest, msg_tag);
fflush(stdout);

// Attempt with Isend
//MPI_Request req;
//MPI_Status status;
//MPI_Isend(rec_msg, msg_size, MPI_DOUBLE, dest, msg_tag, comm, &req);
//MPI_Wait(&req, &status);

Output log:

Sending 91 rows to task 9 offset=728
Sending 91 rows to task 10 offset=819
Sending 90 rows to task 11 offset=910
Received results from task 1
[exec_cmd] Send to 0, 508
[exec_cmd] Send to 0, 508 OK
[exec_cmd] Send to 0, 508
[exec_cmd] Send to 0, 508 OK
[exec_cmd] Send double buff to 0, 508
[exec_cmd] Send to 0, 510
[exec_cmd] Send to 0, 510 OK
[exec_cmd] Send to 0, 510
[exec_cmd] Send to 0, 510 OK
[exec_cmd] Send double buff to 0, 510
Received results from task 2
Received results from task 3
[controller][RECV] src=4 tag=506

The output hangs here...

Is there any way to instrument this to assess whether the problem is actually on the receive end or on the send end?

Regards,
Carlos.

On Wed, Mar 27, 2019 at 11:13 AM Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> Carlos,
>
> can you post a trimmed version of your code that evidences the issue?
>
> Keep in mind that if you want to write MPI code that is correct with
> respect to the standard, you should assume MPI_Send() might block until a
> matching receive is posted.
>
> Cheers,
>
> Gilles
>
> Sent from my iPod
>
> On Mar 27, 2019, at 20:46, carlos aguni <aguni...@gmail.com> wrote:
>
> Not "MPI_Send from 0"..
> MPI_Send from 1 to 0
> MPI_Send from 7 to 0
> And so on..
>
> On Wed, Mar 27, 2019, 8:43 AM carlos aguni <aguni...@gmail.com> wrote:
>
>> Hi all.
>>
>> I have an MPI application in which, at one point, one rank receives a slice
>> of an array from the other nodes.
>> The thing is that my application hangs there.
>>
>> One thing I could get from printing out logs is:
>> (Rank 0) starts MPI_Recv from source 4
>> But then it receives:
>> MPI_Send from 0
>> MPI_Send from 1
>> ... from 10
>> ... from 7
>> ... from 6
>>
>> Then at one point neither of them is responding.
>> The message is a double array of size 100,000.
>> Later it would receive the message from 4.
>>
>> So I assume the buffer on the Recv side is overflowing.
>>
>> A few tests:
>> - Using a smaller array size works.
>> - I already tried using Isend, Irecv, and Bsend, and the ranks still get stuck.
>>
>> So that leaves me with a few questions beyond how to solve this issue:
>> - How can I know the size of MPI's internal buffer?
>> - How would one debug this?
>>
>> Regards,
>> Carlos.
>>
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users