What version of OMPI are you using? That error message looks like something from an ancient version - might be worth updating.
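For what it's worth, errno 110 on Linux is ETIMEDOUT, meaning the connect() call never got an answer. Working ping and ssh only show that ICMP and the ssh port get through; the TCP connections between the MPI processes may use other ports, which a firewall between Internet-hosted nodes could be dropping. Here is a rough Python sketch for checking that by hand. The host name `nodeC` and port `5000` are placeholders I made up, not anything Open MPI uses by default, so substitute a port you know something is listening on.

```python
import errno
import os
import socket
from typing import Tuple


def describe_errno(code: int) -> str:
    """Return 'NAME: message' for a numeric errno on this platform."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return f"{name}: {os.strerror(code)}"


def can_connect(host: str, port: int, timeout_s: float = 5.0) -> Tuple[bool, str]:
    """Attempt a plain TCP connect and report success or the failure reason."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True, "connected"
    except OSError as exc:  # covers timeouts, refusals, and DNS failures
        return False, f"{type(exc).__name__}: {exc}"


if __name__ == "__main__":
    # On Linux this prints the ETIMEDOUT entry; numbering differs on other OSes.
    print(describe_errno(110))
    # Hypothetical target: replace with a real host and a port that is listening.
    print(can_connect("nodeC", 5000, timeout_s=10.0))
```

If a listener is running on the far side and this still times out from B to C, then the problem is more likely packet filtering than a timeout value that is too short.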

On Dec 13, 2010, at 4:04 AM, peifan wrote:

> I have 3 nodes: one is the master node and the others are compute nodes. These
> nodes are deployed on the Internet (not in a cluster).
>
> When I run NPB (NASA Parallel Benchmarks) on one node (using 2 processes):
>   mpirun -np 2 exe
> I get a successful result, but when I run on two nodes (for example,
> on nodes B and C) it fails:
>   mpirun -nolocal -hostfile hostfile -np 2 exe
> The failure output is:
>   B [0,1,0] connectimeout ,connect() fail errno=110
>   C [0,1,1] connectimeout ,connect() fail errno=110
> But the connection between B and C has no problem, because I can use ping and
> ssh from B to C (or C to B).
> I think this problem may be caused by the connectimeout parameter (is it so
> small that it leads to the failure?), because my nodes are deployed on the
> Internet, so the latency is higher.
> Who can help me solve this problem, and how do I set the connectimeout in
> Open MPI?
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users