Gretchen,

Could you please send stack traces of the processes when the job hangs (using
padb or gdb)?
Does the same problem persist at small scale (2 or 3 nodes)?
What is the minimal setup that reproduces the problem?

-- YK

> 
> ---------- Forwarded message ----------
> From: *Gretchen* <umassastroh...@gmail.com <mailto:umassastroh...@gmail.com>>
> Date: Mon, Mar 28, 2011 at 8:35 PM
> Subject: Re: [OMPI users] gadget2 infiniband openmpi hang
> To: us...@open-mpi.org <mailto:us...@open-mpi.org>
> 
> 
> The gadget code hangs at the same spot (i.e. number of steps completed AND 
> same section of code) when I run with --mca btl_openib_cpc_include rdmacm
> (code is doing  MPI_Sendrecv).
> Thanks,
> Gretchen
> 
> 
>     Date: Thu, 17 Mar 2011 12:45:32 -0400
>     From: Jeff Squyres <jsquy...@cisco.com <mailto:jsquy...@cisco.com>>
>     Subject: Re: [OMPI users] gadget2 infiniband openmpi hang
>     To: Open MPI Users <us...@open-mpi.org <mailto:us...@open-mpi.org>>
>     Message-ID: <c03801dd-a057-4544-a365-f24836879...@cisco.com <mailto:c03801dd-a057-4544-a365-f24836879...@cisco.com>>
>     Content-Type: text/plain; charset=us-ascii
> 
>     Are you able to run if you use --mca btl_openib_cpc_include rdmacm ?
> 
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org <mailto:us...@open-mpi.org>
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
