Hello Sofia,
After talking with another OMPI member, could you humor me and run
"/sbin/iptables -L" on both of your machines? You'll need to be root to
do so.
--td
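
As a complement to the iptables listing, here is a minimal sketch (not part of the original exchange) of a one-shot TCP connect probe that could be run from each machine toward the other. The function name `probe_tcp` and the choice of port are assumptions for illustration; Open MPI's TCP BTL picks its ports dynamically, so the probe only shows whether a given port is reachable, not which port MPI would use.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Try a single TCP connect to (host, port) and report how it ended.

    Returns "open", "refused", "timeout", or "error: <errno name>".
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.timeout:
        # No answer at all within the timeout.
        return "timeout"
    except ConnectionRefusedError:
        # The host answered, but nothing accepts connections on this port.
        return "refused"
    except OSError as exc:
        # Anything else, e.g. EHOSTUNREACH or ENETUNREACH.
        return "error: " + errno.errorcode.get(exc.errno, str(exc))
    finally:
        sock.close()


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python probe.py 10.1.10.240 22
    target_host, target_port = sys.argv[1], int(sys.argv[2])
    print(target_host, target_port, probe_tcp(target_host, target_port))
```

A "timeout" result often points to packets being dropped silently somewhere along the path, while "refused" usually means the host is reachable but nothing is listening on that port (or a rule actively rejects the connection). Neither result is conclusive on its own.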
List-Post: users@lists.open-mpi.org
Date: Tue, 23 Sep 2008 06:02:30 -0400
From: Terry Dontje <terry.don...@sun.com>
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: us...@open-mpi.org
Message-ID: <48d8beb6.8040...@sun.com>
Content-Type: text/plain; format=flowed; charset=ISO-8859-1
Hello Sofia,

Looking at your stack trace, it is what I thought was happening: one
process is stuck trying to connect to the other. The stack unfortunately
does not give enough information as to why. The only suggestion I can
give is to walk through a debuggable version of the code from
ompi_init_do_preconnect, find where the process is calling connect, and
see if the connect call is failing. If you don't have a firewall, I am
not sure what is blocking the connection from happening. Either the
address is somehow being mashed up or it is something else.

--td

Date: Mon, 22 Sep 2008 10:49:41 +0200
From: "Sofia Aparicio Secanellas" <sapari...@grpss.ssr.upm.es>
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: "Open MPI Users" <us...@open-mpi.org>
Message-ID: <2F607CC2B43A422B80CEBBD540BFFE8B@aparicio1>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello Terry,

I do not have an active firewall. I have typed on both computers:

netstat -lnut

I enclose the results.

I have also run on both computers:

mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1
--prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1
./PruebaSumaParalela.out

I enclose the results.

Thank you.

Sofia

----- Original Message -----
From: "Terry Dontje" <terry.don...@sun.com>
To: <us...@open-mpi.org>
Sent: Friday, September 19, 2008 7:54 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
> > Hello Sofia,
> >
> > After further reflection I wonder if you have a firewall that is
> > preventing connections to certain ports.
> >
> > --td
> >
> > Terry Dontje wrote:
>
>> >> Hello Sofia,
>> >>
>> >> Ok, so I really wanted the stack from when you run with "-mca
>> >> mpi_preconnect_all 1". I believe you'll see that one of the processes
>> >> will be in init. However, the stack still probably will not help me help
>> >> you. What needs to happen is to step through the code in dbx while the
>> >> connection is trying to be established. I am hoping you might find the
>> >> connect call fails or that we've been given an interface that somehow
>> >> cannot reach the other node. However, when you specified "-mca
>> >> btl_tcp_if_include eth1" that should have forced things to use the
>> >> interface you need. So it really comes down to why are we not connecting
>> >> to the eth1 address? Are we failing on routing to that address or is the
>> >> connect failing because we are trying to use a port that we are not
>> >> really allowed to use or is it something else?
>> >>
>> >> I don't think it is a routing problem since you are able to reach each
>> >> node via ssh. Is there someone else on the list that might want to lend
>> >> a hand here? I feel like I am missing something obvious going on here.
>> >>
>> >> --td
>>
>>> >>> Date: Fri, 19 Sep 2008 16:09:11 +0200
>>> >>> From: "Sofia Aparicio Secanellas" <sapari...@grpss.ssr.upm.es>
>>> >>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>>> >>> To: "Open MPI Users" <us...@open-mpi.org>
>>> >>> Message-ID: <1BBF50FE29F743B5829CC3785F47CADD@aparicio1>
>>> >>> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>>> >>>
>>> >>> Hello Terry,
>>> >>>
>>> >>> I have installed 1.2.7 and I obtain the same result.
>>> >>>
>>> >>> I will explain what I have done.
>>> >>>
>>> >>> 1. On my computer edu@10.1.10.240 I have added a new user called sofia.
>>> >>> This way I have sofia@10.1.10.208 and sofia@10.1.10.240.
>>> >>> 2. I have downloaded the openmpi 1.2.7 from the openmpi website on both
>>> >>> computers in /home/sofia/Desktop.
>>> >>> 3. I have installed everything using "sudo ./configure", "sudo make" and
>>> >>> "sudo make install".
>>> >>> 4. To make ssh not ask me for a password, I have typed in
>>> >>> sofia@10.1.10.208 "ssh-keygen -t dsa", "cd $HOME/.ssh" and "cp
>>> >>> id_dsa.pub authorized_keys". I have copied the directory
>>> >>> "/home/sofia/.ssh" from sofia@10.1.10.208 to /home/sofia/.ssh in
>>> >>> sofia@10.1.10.240. The ssh command without password works on computer
>>> >>> sofia@10.1.10.208 but computer sofia@10.1.10.208 asks me for a
>>> >>> passphrase and for the password. Is it normal?
>>> >>> 5. I have created a directory "/home/sofia/programasparalelos" on both
>>> >>> computers and I have given permissions to the directory with "chmod
>>> >>> 777".
>>> >>> 6. I have copied on both computers in "/home/sofia/programasparalelos"
>>> >>> the program "PruebaSumaParalela.c" (I have changed the program a
>>> >>> little; I enclose the new program) and I have compiled using "mpicc
>>> >>> PruebaSumaParalela.c -o PruebaSumaParalela.out".
>>> >>>
>>> >>> 7. Now I run the program on both computers using the command:
>>> >>>
>>> >>> mpirun -np 2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local
>>> >>> ./PruebaSumaParalela.out
>>> >>>
>>> >>> When I run the program I obtain 3 PIDs executing on every computer, 2
>>> >>> of "./PruebaSumaParalela.out" and 1 of "mpirun -np 2 --host
>>> >>> 10.1.10.208,10.1.10.240 --prefix /usr/local ./PruebaSumaParalela.out". I
>>> >>> enclose the results obtained on every computer for every
>>> >>> "./PruebaSumaParalela.out".
>>> >>>
>>> >>> Thank you very much.
>>> >>>
>>> >>> Sofia
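
On the question raised in the thread of whether the processes are really going out over the eth1 address, the following is a minimal sketch (not part of the original exchange) that asks the operating system which local address it would use to reach a given peer. The function name `local_address_for` and the default port are assumptions for illustration. It reflects the kernel's routing decision only; it does not show which interface Open MPI's TCP BTL selects.

```python
import socket


def local_address_for(peer_ip, port=9):
    """Return the local IPv4 address the kernel would use to reach peer_ip."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket only fixes the route and source address;
        # no packet is sent to the peer.
        sock.connect((peer_ip, port))
        return sock.getsockname()[0]
    finally:
        sock.close()


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python route_check.py 10.1.10.240
    print(local_address_for(sys.argv[1]))
```

If the address printed on each machine is not the one configured on eth1, the route to the peer may not go through the interface named in btl_tcp_if_include, which would be worth checking before stepping through the connect call in a debugger.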