Re: [OMPI users] InfiniBand path migration not working

2012-03-21 Thread Jeremy
Hi Pasha, I just wanted to check if you had any further suggestions regarding the APM issue based on the updated info in my previous email. Thanks, -Jeremy On Mon, Mar 12, 2012 at 12:43 PM, Jeremy wrote: > Hi Pasha, Yevgeny, > >>> My educated guess is that for some reason it is no direct conn

Re: [OMPI users] InfiniBand path migration not working

2012-03-21 Thread Shamis, Pavel
Jeremy, As far as I understand, the tool that Evgeny recommended showed that the remote port is reachable. Based on the logs that have been provided, I can't find the issue in ompi; everything seems to be kosher. Unfortunately, I do not have a platform where I may try to reproduce the issue. I wo

[OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Brock Palen
I have a user's code that appears to be hanging sometimes on MPI_Waitall(), stack trace from padb below. It is on qlogic IB using the psm mtl. Without knowing what requests go to which rank, how can I check that this code didn't just get itself into a deadlock? Is there a way to get a readable

Re: [OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Brock Palen
Forgotten stack as promised, it keeps changing at the lower level opal_progress, but never moves above that. [yccho@nyx0817 ~]$ padb -Ormgr=orte --all --stack-trace --tree --all Stack trace(s) for thread: 1 - [0-63] (64 processes) - main() at ?:? Loci::makeQuery

Re: [OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Jeffrey Squyres
We unfortunately don't have much visibility into the PSM device (meaning: Open MPI is a thin shim on top of the underlying libpsm, which handles all the MPI point-to-point semantics itself). So we can't even ask you to run padb to look at the message queues, because we don't have access to them

Re: [OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Brock Palen
tcp with this code? Can we disable the psm mtl and use the verbs emulation on qlogic? While the qlogic verbs isn't that great it is still much faster in my tests than tcp. Is there a particular reason to pick tcp? Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro...@umich.edu (734)

Re: [OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Jeffrey Squyres
On Mar 21, 2012, at 11:34 AM, Brock Palen wrote: > tcp with this code? Does it matter enough for debugging runs? > Can we disable the psm mtl and use the verbs emulation on qlogic? While the > qlogic verbs isn't that great it is still much faster in my tests than tcp. > > Is there a particula

Re: [OMPI users] MPI_Waitall hangs and querying

2012-03-21 Thread Brock Palen
Will do. Right now I have asked the user to try rebuilding with the newest openmpi just to be safe. Interesting behavior: on rank 0 the IB counters (using collectl) never get a packet in, only packets out. Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro...@umich.edu (734)936-1985 O