Gretchen,
Could you please send a stack trace of the processes when it hangs (with
padb/gdb)?
Does the same problem persist at small scale (2 or 3 nodes)?
What is the minimal setup that reproduces the problem?
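In case it helps, here is a rough sketch of grabbing a backtrace from one hung rank with gdb. The PID is a placeholder (not from your job); you would look up the real one on the node, e.g. with ps. The snippet only prints the command so you can check it before running it.

```shell
# Placeholder PID of one hung MPI rank (hypothetical value, replace with the real one).
PID=12345

# Attach in batch mode, dump a backtrace of every thread, then detach.
GDB_CMD="gdb -p $PID -batch -ex 'thread apply all bt'"

# Print the command for review; run it by hand on the node where the rank hangs.
echo "$GDB_CMD"
```

Traces from two or three ranks on different nodes are usually enough to see where they are stuck.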
-- YK
>
> -- Forwarded message --
> From: *Gretchen* mailto:umassastroh..
Michael,
Could you try to run this again with the "--mca mpi_leave_pinned 0" parameter?
I suspect that this might be due to a message size problem - MPI
tries to do RDMA with a message bigger than what the HCA supports.
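For example, something along these lines. The process count and the binary name are placeholders I made up, not taken from your job; keep whatever other options you normally pass. The snippet only prints the command line.

```shell
# Hypothetical values: substitute your own process count and executable.
NP=4
APP=./my_mpi_app

# Same launch as before, with leave-pinned disabled through the MCA parameter.
CMD="mpirun -np $NP --mca mpi_leave_pinned 0 $APP"

# Print the command for review before running it.
echo "$CMD"
```

Setting OMPI_MCA_mpi_leave_pinned=0 in the environment before calling mpirun should have the same effect, if that is easier to fit into your job script.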
-- YK
On 11-Apr-11 7:44 PM, Michael Di Domenico wrote:
> Here's a chunk of code that