Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello, Thanks, I figured out what the exact problem was in my case. Now I am using the following execution line; it directs the MPI communication port to start from 1... mpiexec -n 2 --host karp,wirth --mca btl ^openib --mca btl_tcp_if_include br0 --mca btl_tcp_port_min_v4 1 ./a.out and every
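
For reference, the TCP BTL also has a btl_tcp_port_range_v4 parameter, and a contiguous range above the privileged ports (below 1024) is usually easier to open in a firewall than a minimum of 1. A sketch along those lines, with the port values purely illustrative:

  # pin the TCP BTL to a fixed, unprivileged port window (values are only illustrative)
  mpiexec -n 2 --host karp,wirth --mca btl ^openib \
      --mca btl_tcp_if_include br0 \
      --mca btl_tcp_port_min_v4 10500 --mca btl_tcp_port_range_v4 100 ./a.out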

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello, I am not sure what approach the MPI communication follows, but when I use --mca btl_base_verbose 30 I observe the mentioned port. [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on port 4 [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_con
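
For reference, that output comes from raising the TCP BTL's verbosity; a sketch of the full invocation, assuming the same hosts and binary as elsewhere in the thread:

  # print the connection attempts (address and port) made by the TCP BTL
  mpiexec -n 2 --host karp,wirth --mca btl ^openib \
      --mca btl_tcp_if_include br0 --mca btl_base_verbose 30 ./a.out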

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Reuti
Hi, On 25.03.2014 at 08:34, Hamid Saeed wrote: > Is it possible to change the port number for the MPI communication? > > I can see that my program uses port 4 for the MPI communication. > > [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on > port 4 > [karp][[4612,1],0

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-25 Thread Hamid Saeed
Hello, Is it possible to change the port number for the MPI communication? I can see that my program uses port 4 for the MPI communication. [karp:23756] btl: tcp: attempting to connect() to address 134.106.3.252 on port 4 [karp][[4612,1],0][btl_tcp_endpoint.c:655:mca_btl_tcp_endpoint_complete_co

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Hamid Saeed
Hello Jeff, Thanks for your cooperation. --mca btl_tcp_if_include br0 worked out of the box. The problem was on the network administrator's side. The machines on the network side were halting the MPI... so cleaning up and killing everything worked. :) Regards. On Mon, Mar 24, 2014 at 4:34 PM, Jef
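
A hedged aside on the "cleaning and killing" step: stale Open MPI runtime daemons (orted) left behind by aborted jobs can block new runs, and clearing them on each node looks something like this (the process names are only illustrative of a typical 1.x setup):

  # remove leftover Open MPI daemons and the stuck application for this user
  pkill -u "$USER" orted
  pkill -u "$USER" a.out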

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Jeff Squyres (jsquyres)
There is no "self" IP interface in the Linux kernel. Try using btl_tcp_if_include and list just the interface(s) that you want to use. From your prior email, I'm *guessing* it's just br2 (i.e., the 10.x address inside your cluster). Also, it looks like you didn't set up your SSH keys properly f
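
Put together, that suggestion amounts to something like the following sketch (br2 is only the guess at the cluster-internal interface mentioned above):

  mpirun -np 2 --host karp,wirth --mca btl tcp,sm,self \
      --mca btl_tcp_if_include br2 ./a.out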

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Hamid Saeed
Hello, I added the "self", e.g. hsaeed@karp:~/Task4_mpi/scatterv$ mpirun -np 8 --mca btl ^openib --mca btl_tcp_if_exclude sm,self,lo,br0,br1,ib0,br2 --host karp,wirth ./scatterv Enter passphrase for key '/home/hsaeed/.ssh/id_rsa': ---
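
A side note on the passphrase prompt above: loading the key into ssh-agent once avoids the interactive prompt during mpirun's SSH logins. A sketch using standard OpenSSH tools:

  # start an agent and unlock the key once; later mpirun launches reuse it
  eval "$(ssh-agent -s)"
  ssh-add ~/.ssh/id_rsa

One caveat: sm and self are BTL component names rather than network interfaces, so btl_tcp_if_exclude normally expects only interface names (e.g. lo, br0, ib0).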

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Jeff Squyres (jsquyres)
If you use btl_tcp_if_exclude, you also need to exclude the loopback interface. Loopback is excluded by the default value of btl_tcp_if_exclude, but if you overwrite that value, then you need to *also* include the loopback interface in the new value. On Mar 24, 2014, at 4:57 AM, Hamid Sa
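
Concretely, overriding the default while keeping loopback excluded looks something like this (the interface list is illustrative):

  # lo must stay in the list whenever btl_tcp_if_exclude is overridden
  mpirun -np 2 --host karp,wirth --mca btl_tcp_if_exclude lo,ib0 ./a.out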

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-24 Thread Hamid Saeed
Hello, I am still facing problems. I checked that there is no firewall acting as a barrier to the MPI communication. I even used an execution line like hsaeed@karp:~/Task4_mpi/scatterv$ mpiexec -n 2 --mca btl_tcp_if_exclude br2 -host wirth,karp ./a.out Now the output hangs without displayi
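
When a run hangs like this, two quick checks narrow it down (a sketch; the address and port are the ones quoted earlier in the thread): raise the BTL verbosity to see which connect() is stuck, and test raw TCP reachability between the nodes with a tool such as netcat.

  mpiexec -n 2 --host wirth,karp --mca btl_base_verbose 30 ./a.out
  # from the other node: is plain TCP to that address/port possible at all?
  nc -vz 134.106.3.252 4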

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-21 Thread Jeff Squyres (jsquyres)
Do you have any firewalling enabled on these machines? If so, you'll want to either disable it, or allow random TCP connections between any of the cluster nodes. On Mar 21, 2014, at 10:24 AM, Hamid Saeed wrote: > /sbin/ifconfig > > hsaeed@karp:~$ /sbin/ifconfig > br0 Link encap:Ethern
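
If a firewall does turn out to be active, opening TCP between the nodes can be done with something like the following iptables sketch (the subnet is inferred from the 134.106.3.x addresses and /24 netmask shown below, and is only illustrative):

  # accept all TCP traffic from the cluster subnet
  iptables -A INPUT -p tcp -s 134.106.3.0/24 -j ACCEPT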

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-21 Thread Hamid Saeed
/sbin/ifconfig hsaeed@karp:~$ /sbin/ifconfig br0 Link encap:Ethernet HWaddr 00:25:90:59:c9:ba inet addr:134.106.3.231 Bcast:134.106.3.255 Mask:255.255.255.0 inet6 addr: fe80::225:90ff:fe59:c9ba/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1

Re: [OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-21 Thread Jeff Squyres (jsquyres)
On Mar 21, 2014, at 10:09 AM, Hamid Saeed wrote: > > I think I have a TCP connection. As far as I know my cluster is not > > configured for InfiniBand (IB). Ok. > > but even for TCP connections. > > > > mpirun -n 2 -host master,node001 --mca btl tcp,sm,self ./helloworldmpi > > mpirun -n 2 -hos
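
For context, "tcp,sm,self" selects three point-to-point transports: TCP between nodes, shared memory (sm) between ranks on the same node, and self for a rank sending to itself. A two-node TCP-only test along the lines quoted above would be:

  mpirun -n 2 -host master,node001 --mca btl tcp,sm,self ./helloworldmpi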

[OMPI users] Fwd: problem for multiple clusters using mpirun

2014-03-21 Thread Hamid Saeed
-- Forwarded message -- From: Jeff Squyres (jsquyres) Date: Fri, Mar 21, 2014 at 3:05 PM Subject: Re: problem for multiple clusters using mpirun To: Hamid Saeed Please reply on the mailing list; more people can reply that way, and the answers