Hello,

I am still facing problems. I checked, and there is no firewall acting as a barrier to the MPI communication.
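A firewall check like this can be repeated independently of MPI with a plain TCP probe. The following is only a sketch of one way to do it, not the check described above: the function name probe_tcp, the 2-second timeout, and the idea of a test listener on a fixed port are assumptions made for illustration. It tries a single connection and reports whether the port was open, refused, or timed out.

```python
import socket


def probe_tcp(host, port, timeout=2.0):
    """Try one TCP connection to (host, port) and classify the outcome.

    Hypothetical helper for illustration; it is not part of Open MPI.
    """
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered, but nothing accepted the connection there.
        return "refused"
    except socket.timeout:
        # No answer within the timeout.
        return "timeout"
    except OSError as exc:
        # Any other failure, for example "network is unreachable".
        return "error: %s" % exc
```

For example, with a test listener started on karp on some free port (5000 here is hypothetical), calling probe_tcp("134.106.3.231", 5000) from wirth, and then the same for karp's other addresses, shows which interfaces accept connections. "refused" means the host answered but nothing accepted the connection on that address and port; "timeout" usually points to filtering or a missing route.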
I also tried the following command line:

hsaeed@karp:~/Task4_mpi/scatterv$ mpiexec -n 2 --mca btl_tcp_if_exclude br2 -host wirth,karp ./a.out

Now the program hangs without displaying any error. I used "btl_tcp_if_exclude br2" because the failed connection was to the br2 address (10.231.2.231), as you can see in the "ifconfig" output quoted below.

I know something is wrong, but I cannot figure out what it is. I would appreciate any further suggestions.

Regards.

On Fri, Mar 21, 2014 at 6:05 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Do you have any firewalling enabled on these machines? If so, you'll want
> to either disable it, or allow random TCP connections between any of the
> cluster nodes.
>
>
> On Mar 21, 2014, at 10:24 AM, Hamid Saeed <e.hamidsa...@gmail.com> wrote:
>
> > /sbin/ifconfig
> >
> > hsaeed@karp:~$ /sbin/ifconfig
> > br0       Link encap:Ethernet  HWaddr 00:25:90:59:c9:ba
> >           inet addr:134.106.3.231  Bcast:134.106.3.255  Mask:255.255.255.0
> >           inet6 addr: fe80::225:90ff:fe59:c9ba/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:49080961 errors:0 dropped:50263 overruns:0 frame:0
> >           TX packets:43279252 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:41348407558 (38.5 GiB)  TX bytes:80505842745 (74.9 GiB)
> >
> > br1       Link encap:Ethernet  HWaddr 00:25:90:59:c9:bb
> >           inet addr:134.106.53.231  Bcast:134.106.53.255  Mask:255.255.255.0
> >           inet6 addr: fe80::225:90ff:fe59:c9bb/64 Scope:Link
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:41573060 errors:0 dropped:50261 overruns:0 frame:0
> >           TX packets:1693509 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:6177072160 (5.7 GiB)  TX bytes:230617435 (219.9 MiB)
> >
> > br2       Link encap:Ethernet  HWaddr 00:c0:0a:ec:02:e7
> >           inet addr:10.231.2.231  Bcast:10.231.2.255  Mask:255.255.255.0
> >           UP BROADCAST MULTICAST  MTU:1500  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> >
> > eth0      Link encap:Ethernet  HWaddr 00:25:90:59:c9:ba
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:69108377 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:86459066 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:43533091399 (40.5 GiB)  TX bytes:83359370885 (77.6 GiB)
> >           Memory:dfe60000-dfe80000
> >
> > eth1      Link encap:Ethernet  HWaddr 00:25:90:59:c9:bb
> >           UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
> >           RX packets:43531546 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:1716151 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:1000
> >           RX bytes:7201915977 (6.7 GiB)  TX bytes:232026383 (221.2 MiB)
> >           Memory:dfee0000-dff00000
> >
> > lo        Link encap:Local Loopback
> >           inet addr:127.0.0.1  Mask:255.0.0.0
> >           inet6 addr: ::1/128 Scope:Host
> >           UP LOOPBACK RUNNING  MTU:16436  Metric:1
> >           RX packets:10890707 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:10890707 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:0
> >           RX bytes:36194379576 (33.7 GiB)  TX bytes:36194379576 (33.7 GiB)
> >
> > tap0      Link encap:Ethernet  HWaddr 00:c0:0a:ec:02:e7
> >           UP BROADCAST MULTICAST  MTU:1500  Metric:1
> >           RX packets:0 errors:0 dropped:0 overruns:0 frame:0
> >           TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
> >           collisions:0 txqueuelen:500
> >           RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)
> >
> > When i execute the following line
> >
> > hsaeed@karp:~/Task4_mpi/scatterv$ mpiexec -n 2 -host wirth,karp ./a.out
> >
> > i receive Error
> >
> > [wirth][[59430,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect] connect() to 10.231.2.231 failed: Connection refused (111)
> >
> > NOTE: Karp and wirth are two machines on ssh cluster.
> >
> >
> > On Fri, Mar 21, 2014 at 3:13 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:
> > On Mar 21, 2014, at 10:09 AM, Hamid Saeed <e.hamidsa...@gmail.com> wrote:
> >
> > > I think i have a tcp connection. As for as i know my cluster is not configured for Infiniband (IB).
> >
> > Ok.
> >
> > > but even for tcp connections.
> > >
> > > > mpirun -n 2 -host master,node001 --mca btl tcp,sm,self ./helloworldmpi
> > > > mpirun -n 2 -host master,node001 ./helloworldmpi
> > > >
> > > > These line are not working they output
> > > > Error like
> > > > [btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_connect] connect() to xx.xxx.x.xxx failed: Connection refused (111)
> >
> > What are the IP addresses reported by connect()? (i.e., the address you X'ed out)
> >
> > Send the output from ifconfig on each of your servers. Note that some
> > Linux distributions do not put ifconfig in the default PATH of normal
> > users; look for it in /sbin/ifconfig or /usr/sbin/ifconfig.
> >
> > --
> > Jeff Squyres
> > jsquy...@cisco.com
> > For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> > _______________________________________________
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> > --
> > _______________________________________________
> > Hamid Saeed
> > CoSynth GmbH & Co. KG
> > Escherweg 2 - 26121 Oldenburg - Germany
> > Tel +49 441 9722 738 | Fax -278
> > http://www.cosynth.com
> > _______________________________________________
>
> --
> Jeff Squyres
> jsquy...@cisco.com

--
_______________________________________________
Hamid Saeed
CoSynth GmbH & Co. KG
Escherweg 2 - 26121 Oldenburg - Germany
Tel +49 441 9722 738 | Fax -278
http://www.cosynth.com
_______________________________________________