I wonder if we can narrow this down a bit to perhaps a PML protocol
issue.
Start by disabling RDMA by using:
-mca btl_gm_flags 1
This helps some. I at least now see HPL start up, but I never
get a single pass. The output ends at:
- Computational tests pass if scaled residuals are less
than 16.0
On the other hand, with OB1, using btl_gm_flags 1 fixed the error
problem with OMPI, which is a great first step.
mpirun -np 4 --mca btl_gm_flags 1 ./xhpl
This allowed HPL to run with no errors. I verified that performance was
better than when run without gm
(by adding --mca btl ^gm).
So there is still a problem with DR, which I don't need, but I'm willing
to help test it.
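For anyone who wants to repeat the comparison, the three runs described above can be sketched roughly as follows. This is only a sketch under assumptions: only the plain btl_gm_flags command appears verbatim in this thread, and the explicit "--mca pml dr" and "--mca pml ob1" selections are my guess at how the two PMLs were chosen. The run() helper and the RUN variable are hypothetical names added so the script prints the commands by default instead of launching anything.

```shell
#!/bin/sh
# Dry run by default: print each mpirun command instead of executing it.
# Set RUN=1 to actually launch (needs Open MPI and an xhpl binary).
NP=4
BIN=./xhpl

# Hypothetical helper: run the command if RUN=1, otherwise just echo it.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# 1. DR PML with btl_gm_flags 1 (RDMA disabled, per this thread).
#    Assumed invocation; this is the case that never completes a pass.
run mpirun -np "$NP" --mca pml dr --mca btl_gm_flags 1 "$BIN"

# 2. OB1 PML with btl_gm_flags 1. Assumed to be equivalent to the
#    command quoted above, which does not name a PML explicitly.
run mpirun -np "$NP" --mca pml ob1 --mca btl_gm_flags 1 "$BIN"

# 3. Baseline with the gm BTL excluded, for the performance comparison.
run mpirun -np "$NP" --mca btl '^gm' "$BIN"
```

Running it as-is only prints the three command lines, which makes it easy to check the flags before setting RUN=1.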
Scott,
Can we look into why leaving RDMA on is causing a problem?
Brock
Let's see if that helps things out at all.
- Galen
Scott
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users