On May 24, 2010, at 19:42, Gijsbert Wiesenekker wrote:

> My MPI program consists of a number of processes that send 0 or more
> messages (using MPI_Isend) to 0 or more other processes. The processes
> check periodically if messages are available to be processed. It was
> running fine until I increased the message size, and then I got deadlock
> problems. Googling taught me that I was running into a classic deadlock
> problem (see for example
> http://www.cs.ucsb.edu/~hnielsen/cs140/mpi-deadlocks.html). The
> workarounds suggested there, like changing the order of MPI_Send and
> MPI_Recv, do not work in my case, as it could be that one processor does
> not send any message at all to the other processes, so MPI_Recv would
> wait indefinitely.
> Any suggestions on how to avoid deadlock in this case?
>
> Thanks,
> Gijsbert

An approach that seems to work in my case is the following. I was using
separate message tags for 'update_message' and 'no_more_messages'. All of
these were sent asynchronously. The receive code in pseudo-code looked like:

--
if (probe_for_update_message() == FALSE)
{
  if (probe_for_no_more_messages() == TRUE)
  {
    //we are done
  }
  else
  {
    //do some work
  }
}
else
{
  //process update message
}
--

The problem with this receive code was that in between the
probe_for_update_message() and the probe_for_no_more_messages() a processor
could send several update messages, followed by 'no_more_messages', so I
still needed to check for any pending update messages after a successful
probe_for_no_more_messages(), which complicated the deadlock handling. So I
first created a special update message that signals 'no_more_messages',
which simplified the receive code to:

--
//probe_for_update_message() returns INVALID if there are no more messages,
//TRUE if a message is available, FALSE if not
if ((result = probe_for_update_message()) == INVALID)
{
  //we are done
}
else if (result == TRUE)
{
  //process update message
}
else //result == FALSE
{
  //do some work
}
--

Now to deal with the deadlock I first created a function
recv_update_message() that probes for update messages and pushes them onto a
FIFO queue (for several reasons I cannot process the update message right
away).
In pseudo-code:

--
int recv_update_message()
{
  int result;

  if ((result = probe_for_update_message()) == TRUE)
    queue(update_message);
  return(result);
}
--

The asynchronous send code in pseudo-code looks like:

--
MPI_Isend(update_message, &request);
while(TRUE)
{
  //deal with deadlock
  //I assume my deadlocks are caused by running out of system buffer space
  //hopefully polling pending update messages frees up buffer space
  recv_update_message();
  MPI_Test(&request, &flag);
  if (flag) break;
}
--

The asynchronous receive code in pseudo-code looks like:

--
//first check the FIFO queue
if (dequeue(update_message))
  return(TRUE);
else
{
  int result;

  if ((result = recv_update_message()) == INVALID) return(INVALID);
  if (result == TRUE) dequeue(update_message);
  return(result);
}
--

As a further refinement I use a queue per processor. recv_update_message()
first tries to receive messages only for the least used queues, but if a
possible deadlock is detected it tries to receive messages for all queues:

--
MPI_Isend(update_message, &request);
nwaitx = 16; //threshold for deadlock
nwait = 0;
while(TRUE)
{
  if (nwait > 2 * nwaitx)
  {
    printf("possible deadlock detected\n");
    nwaitx = nwait; //raise the threshold so the warning is not repeated every pass
    recv_update_message(all_queues);
  }
  else
  {
    recv_update_message(least_used_queues_only);
  }
  MPI_Test(&request, &flag);
  if (flag) break;
  nwait++;
}
--

Gijsbert
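
The queue() and dequeue() helpers used in the pseudo-code above are not shown
in the post. As a rough sketch only, a FIFO they could be built on might look
like the following in plain C. All names here (update_message_t, msg_queue_t,
queue_init, queue_push, queue_pop, UPDATE_PAYLOAD_BYTES) and the fixed-size
payload are assumptions made for illustration, not the actual implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical payload size; the post does not describe the message layout. */
#define UPDATE_PAYLOAD_BYTES 64

/* Hypothetical update message as it might be stored after being received. */
typedef struct {
    int source;                         /* rank the message came from */
    char payload[UPDATE_PAYLOAD_BYTES]; /* opaque message contents */
} update_message_t;

typedef struct node {
    update_message_t msg;
    struct node *next;
} node_t;

/* Singly linked FIFO: push at the tail, pop from the head. */
typedef struct {
    node_t *head;   /* oldest message, next to be dequeued */
    node_t *tail;   /* newest message */
    size_t length;  /* number of queued messages */
} msg_queue_t;

static void queue_init(msg_queue_t *q)
{
    q->head = NULL;
    q->tail = NULL;
    q->length = 0;
}

/* Appends a copy of *msg. Returns false if memory allocation fails. */
static bool queue_push(msg_queue_t *q, const update_message_t *msg)
{
    assert(q != NULL && msg != NULL);
    node_t *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->msg = *msg;
    n->next = NULL;
    if (q->tail != NULL)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    q->length++;
    return true;
}

/* Copies the oldest message into *out and removes it.
   Returns false if the queue is empty. */
static bool queue_pop(msg_queue_t *q, update_message_t *out)
{
    assert(q != NULL && out != NULL);
    node_t *n = q->head;
    if (n == NULL)
        return false;
    *out = n->msg;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(n);
    q->length--;
    return true;
}
```

For the per-processor refinement, one msg_queue_t per sending rank would be
one way to do it, with the length field available for picking the least used
queues.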