Hi Christian,
I would suggest using MVAPICH2 instead. It is supposedly faster than Open MPI on
InfiniBand, and it seems to have fewer options under the hood, which means fewer
things you have to tweak to get it working for you.
Regards,
Emyr James
Head of Scientific IT
CRG -Centre for Genomic
running Scientific Linux release 7.2.
Regards,
Emyr James
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
On 01/10/2015 10:24, Emyr James wrote:
"ORTE has lost communication with its daemon located on node:
hostname: node123
This is usually due to either a failure of the TCP network
connection to the node, or possibly an internal failure of
the daemon itself. We cannot recover from
Hi,
I am using Open MPI with Platform LSF on our cluster, which has 10GbE
connectivity.
Sometimes things work fine, but we get a lot of occurrences of MPI jobs
not getting off the ground, and the following appears in the log...
"ORTE has lost communication with its daemon located on node:
hostna