Hello Sofia,

After talking with another OMPI member: can you humor me and run
"/sbin/iptables -L" on both your machines? You'll need to be root to
do so.

--td
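
The firewall question above comes down to whether a plain TCP connect
between the two machines succeeds on an arbitrary port. A minimal sketch
of such a check follows; the helper name probe_tcp and the port number in
the usage comment are assumptions made for illustration, not part of
Open MPI, and the ports Open MPI actually picks may differ.

```python
import socket


def probe_tcp(host, port, timeout=3.0):
    """Attempt one TCP connect and return (ok, detail).

    Hypothetical helper for illustration only. A timeout often means
    packets are being dropped silently (for example by a firewall rule),
    while a refusal means the host answered but nothing listens there.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        return False, "timed out (packets may be dropped on the way)"
    except ConnectionRefusedError:
        return False, "refused (host reachable, no listener on that port)"
    except OSError as exc:
        return False, "failed: {}".format(exc)


# Example use from 10.1.10.208, assuming something is listening on the
# (hypothetical) port 5000 on the other machine:
#   print(probe_tcp("10.1.10.240", 5000))
```

Probing port 22 first gives a baseline, since ssh between the two
machines is already known to work; a high port that times out while
port 22 connects would point towards filtering rather than routing.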


List-Post: users@lists.open-mpi.org
Date: Tue, 23 Sep 2008 06:02:30 -0400
From: Terry Dontje <terry.don...@sun.com>
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: us...@open-mpi.org
Message-ID: <48d8beb6.8040...@sun.com>
Content-Type: text/plain; format=flowed; charset=ISO-8859-1

Hello Sofia,

Looking at your stack trace, it is what I thought was happening: one
process is stuck trying to connect to the other. The stack unfortunately
does not give enough information as to why. The only suggestion I can
give is to walk through a debuggable version of the code from
ompi_init_do_preconnect, find where the process calls connect, and see
whether the connect call is failing. If you don't have a firewall, I am
not sure what is blocking the connection. Either the address is somehow
being mangled, or it is something else.

--td

Date: Mon, 22 Sep 2008 10:49:41 +0200
From: "Sofia Aparicio Secanellas" <sapari...@grpss.ssr.upm.es>
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: "Open MPI Users" <us...@open-mpi.org>
Message-ID: <2F607CC2B43A422B80CEBBD540BFFE8B@aparicio1>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello Terry,

I do not have an active firewall. I have typed on both computers:

netstat -lnut

I enclose the results. I have also run on both computers:

mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1 --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1 ./PruebaSumaParalela.out

I enclose the results.

Thank you.

Sofia

----- Original Message -----
From: "Terry Dontje" <terry.don...@sun.com>
To: <us...@open-mpi.org>
Sent: Friday, September 19, 2008 7:54 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv


> > Hello Sofia,
> >
> > After further reflection I wonder if you have a firewall that is
> > preventing connections to certain ports.
> >
> > --td
> >
> > Terry Dontje wrote:
>
>> >> Hello Sofia,
>> >>
>> >> Ok, so I really wanted the stack from when you run with "-mca
>> >> mpi_preconnect_all 1". I believe you'll see that one of the
>> >> processes will be in init. However, the stack still probably will
>> >> not help me help you. What needs to happen is to step through the
>> >> code in dbx while the connection is trying to be established. I am
>> >> hoping you might find the connect call fails or that we've been
>> >> given an interface that somehow cannot reach the other node.
>> >> However, when you specified "-mca btl_tcp_if_include eth1" that
>> >> should have forced things to use the interface you need. So it
>> >> really comes down to why we are not connecting to the eth1 address.
>> >> Are we failing on routing to that address, is the connect failing
>> >> because we are trying to use a port that we are not really allowed
>> >> to use, or is it something else?
>> >>
>> >> I don't think it is a routing problem since you are able to reach
>> >> each node via ssh. Is there someone else on the list that might want
>> >> to lend a hand here? I feel like I am missing something obvious
>> >> going on here.
>> >>
>> >> --td
>>
>>> >>> Date: Fri, 19 Sep 2008 16:09:11 +0200
>>> >>> From: "Sofia Aparicio Secanellas" <sapari...@grpss.ssr.upm.es>
>>> >>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>>> >>> To: "Open MPI Users" <us...@open-mpi.org>
>>> >>> Message-ID: <1BBF50FE29F743B5829CC3785F47CADD@aparicio1>
>>> >>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>> >>>
>>> >>> Hello Terry,
>>> >>>
>>> >>> I have installed 1.2.7 and I obtain the same result.
>>> >>>
>>> >>> I will explain what I have done.
>>> >>>
>>> >>> 1. On my computer edu@10.1.10.240 I have added a new user called
>>> >>> sofia. This way I have sofia@10.1.10.208 and sofia@10.1.10.240.
>>> >>>
>>> >>> 2. I have downloaded openmpi 1.2.7 from the openmpi website on
>>> >>> both computers in /home/sofia/Desktop.
>>> >>>
>>> >>> 3. I have installed everything using "sudo ./configure", "sudo
>>> >>> make" and "sudo make install".
>>> >>>
>>> >>> 4. To make ssh not ask me for a password, I have typed in
>>> >>> sofia@10.1.10.208 "ssh-keygen -t dsa", "cd $HOME/.ssh" and "cp
>>> >>> id_dsa.pub authorized_keys". I have copied the directory
>>> >>> "/home/sofia/.ssh" from sofia@10.1.10.208 to /home/sofia/.ssh in
>>> >>> sofia@10.1.10.240. The ssh command without password works on
>>> >>> computer sofia@10.1.10.208 but computer sofia@10.1.10.208 asks me
>>> >>> for a passphrase and for the password. Is it normal?
>>> >>>
>>> >>> 5. I have created a directory "/home/sofia/programasparalelos" on
>>> >>> both computers and I have given permissions to the directory with
>>> >>> "chmod 777".
>>> >>>
>>> >>> 6. I have copied on both computers in
>>> >>> "/home/sofia/programasparalelos" the program
>>> >>> "PruebaSumaParalela.c" (I have changed the program a little bit, I
>>> >>> enclose the new program) and I have compiled using "mpicc
>>> >>> PruebaSumaParalela.c -o PruebaSumaParalela.out".
>>> >>>
>>> >>> 7. Now I run the program on both computers using the command:
>>> >>>
>>> >>> mpirun -np2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local ./PruebaSumaParalela.out
>>> >>>
>>> >>> When I run the program I obtain 3 PIDs executing on every
>>> >>> computer, 2 of "./PruebaSumaParalela.out" and 1 of "mpirun -np2
>>> >>> --host 10.1.10.208,10.1.10.240 --prefix /usr/local
>>> >>> ./PruebaSumaParalela.out". I enclose the results obtained on every
>>> >>> computer for every "./PruebaSumaParalela.out".
>>> >>>
>>> >>> Thank you very much.
>>> >>>
>>> >>> Sofia
>>> >>>
>>>
>> >>
>> >>
>> > >
