Hi Terry,
> How does the stack for the non-SM BTL run look, I assume it probably is the
> same? Also, can you dump the message queues for rank 1? What's interesting
> is you have a bunch of pending receives, do you expect that to be the case
> when the MPI_Gatherv occurred?

It turns out we have an unbalanced MPI_Bcast buried very deep in the application. After fixing that bug, the application behaves correctly.

Thank you all for the help, and sorry for the false alarm.

Teng