Will do.
Right now I have asked the user to try rebuilding with the newest Open MPI,
just to be safe.
Interesting behavior: on rank 0 the IB counters (using collectl) never show a
packet coming in, only packets going out.
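
One way to cross-check the counter observation is to read the per-port packet
counters straight from sysfs. This is only a sketch: it assumes the HCA driver
exposes the standard port_rcv_packets and port_xmit_packets files under
/sys/class/infiniband, which may not hold for every QLogic driver version. The
function name ib_packets and its optional root argument are made up for
illustration.

```shell
# Print received/transmitted packet counters for every InfiniBand port.
# The optional first argument overrides the sysfs root (hypothetical, for testing).
ib_packets() {
    root="${1:-/sys/class/infiniband}"
    for dir in "$root"/*/ports/*/counters; do
        [ -d "$dir" ] || continue    # no HCA present, or the glob did not match
        rcv=$(cat "$dir/port_rcv_packets" 2>/dev/null) || continue
        xmit=$(cat "$dir/port_xmit_packets" 2>/dev/null) || continue
        echo "$dir rcv=$rcv xmit=$xmit"
    done
}

ib_packets
```

Running it twice on the rank 0 node, a few seconds apart, and comparing the two
outputs would show whether the receive counter moves at all while the job is
hung.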
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro...@umich.edu
(734)936-1985
On Mar 21, 2012, at 11:34 AM, Brock Palen wrote:
> tcp with this code?
Does it matter enough for debugging runs?
> Can we disable the psm mtl and use the verbs emulation on qlogic? While the
> qlogic verbs isn't that great it is still much faster in my tests than tcp.
>
> Is there a particular reason to pick tcp?
tcp with this code?
Can we disable the psm mtl and use the verbs emulation on qlogic? While the
qlogic verbs isn't that great it is still much faster in my tests than tcp.
Is there a particular reason to pick tcp?
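
For reference, a rough sketch of what selecting the verbs path instead of PSM
could look like on the command line. It assumes an Open MPI 1.x build that
includes the openib BTL. The helper name psm_free_cmd and ./a.out are
placeholders, and the function only prints the command rather than running it.

```shell
# Print (do not run) an mpirun command line that avoids the PSM MTL.
# psm_free_cmd is a hypothetical helper name used only for this sketch.
psm_free_cmd() {
    np="$1"; shift
    # pml ob1 selects the BTL-based point-to-point layer, so no MTL (psm) is used.
    # btl openib,sm,self: verbs, shared memory, and loopback transports.
    echo "mpirun --mca pml ob1 --mca btl openib,sm,self -np $np $*"
}

psm_free_cmd 64 ./a.out
```

Swapping openib for tcp in the btl list would give the TCP variant for
comparison.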
Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro...@umich.edu
(734)
We unfortunately don't have much visibility into the PSM device (meaning: Open
MPI is a thin shim on top of the underlying libpsm, which handles all the MPI
point-to-point semantics itself). So we can't even ask you to run padb to look
at the message queues, because we don't have access to them
Here is the stack trace I forgot, as promised. It keeps changing at the lower
level, in opal_progress, but never moves above that.
[yccho@nyx0817 ~]$ padb -Ormgr=orte --all --stack-trace --tree --all
Stack trace(s) for thread: 1
-
[0-63] (64 processes)
-
main() at ?:?
Loci::makeQuery
I have a user's code that appears to hang sometimes in MPI_Waitall(); stack
trace from padb below. It is on QLogic IB using the psm mtl.
Without knowing which requests go to which rank, how can I check that this code
didn't just get itself into a deadlock? Is there a way to get a readable