Assuming a correct implementation, the described communication pattern
should work seamlessly.
Would it be possible to either share a reproducer or provide the execution
stack, obtained by attaching a debugger to the deadlocked application, so that
we can see the state of the different processes? I wonder if all processe
Hi,
A source of sudden deadlocks at larger scale can be a change of the send
behavior from buffered to synchronous mode. You can check whether your
application also deadlocks at smaller scale if you replace every send with its
synchronous counterpart (e.g., add `#define MPI_Send MPI_Ssend` and
`#define MPI_Isend MPI_Issend` after