You mentioned running this in a VM. Is 10.104.5.40 (the remote host in the error below) the correct IP address for reaching the other VM?
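One way to check this from each node, as a rough sketch rather than a definitive diagnostic: the Python script below looks up what the host names from your mpirun command resolve to and compares the result with the address in the error message. The helper names `resolve_ipv4` and `can_connect` are made up for illustration, and port 22 is only an assumption (sshd), picked because you said ssh works between the nodes. A successful connection on port 22 does not prove that the ports the Open MPI daemons use are reachable.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted IPv4 addresses that `hostname` resolves to on this machine."""
    try:
        infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name does not resolve on this node
    return sorted({info[4][0] for info in infos})


def can_connect(address, port, timeout=3.0):
    """Return True if a TCP connection to (address, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Host names come from the mpirun command in the quoted message;
    # 10.104.5.40 is the remote host reported in the error.
    # Port 22 is an assumption (sshd), used only as a known-open port.
    reported = "10.104.5.40"
    for host in ("fehg_node_0", "fehg_node_1"):
        addrs = resolve_ipv4(host)
        print(host, "->", addrs or "does not resolve")
        for addr in addrs:
            print("  matches error message:", addr == reported,
                  "| port 22 reachable:", can_connect(addr, 22))
```

If a name resolves to an address on a different interface than the one the VMs share (for example a NAT or host-only adapter), that would be consistent with the connection failure you are seeing.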
> On Mar 28, 2015, at 8:38 AM, LOTFIFAR F. <foad.lotfi...@durham.ac.uk> wrote:
>
> Hi,
>
> I am wondering how can I solve this problem.
> System Spec:
> 1- Linux cluster with two nodes (master and slave) with Ubuntu 12.04 LTS 32bit.
> 2- openmpi 1.8.4
>
> I do a simple test running on fehg_node_0:
>
> mpirun -host fehg_node_0,fehg_node_1 hello_world -mca oob_base_verbose 20
>
> and I get the following error:
>
> A process or daemon was unable to complete a TCP connection
> to another process:
> Local host: fehg-node-0
> Remote host: 10.104.5.40
> This is usually caused by a firewall on the remote host. Please
> check that any firewall (e.g., iptables) has been disabled and
> try again.
> ------------------------------------------------------------
> --------------------------------------------------------------------------
> ORTE was unable to reliably start one or more daemons.
> This usually is caused by:
>
> * not finding the required libraries and/or binaries on
> one or more nodes. Please check your PATH and LD_LIBRARY_PATH
> settings, or configure OMPI with --enable-orterun-prefix-by-default
>
> * lack of authority to execute on one or more specified nodes.
> Please verify your allocation and authorities.
>
> * the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
> Please check with your sys admin to determine the correct location to use.
>
> * compilation of the orted with dynamic libraries when static are required
> (e.g., on Cray). Please check your configure cmd line and consider using
> one of the contrib/platform definitions for your system type.
>
> * an inability to create a connection back to mpirun due to a
> lack of common network interfaces and/or no route found between
> them. Please check network connectivity (including firewalls
> and network routing requirements).
>
> Verbose:
> 1- I have full access to the VMs on the cluster and setup everything myself
> 2- Firewall and iptables are all disabled on the nodes
> 3- nodes can ssh to each other with no problem
> 4- non-interactive bash calls works fine i.e. when I run ssh othernode env | grep PATH from both nodes, both PATH and LD_LIBRARY_PATH are set correctly
> 5- I have checked the posts, a similar problem reported for Solaris but I could not find a clue about mine.
> 6- run with --enable-orterun-prefix-by-default does not make any changes.
> 7- I see orte is running on the other node when I check processes, but nothing happens after that and the error happens.
>
> Regards,
> Karos
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/03/26555.php