Hello Terry,

I have installed 1.2.7 and I obtain the same result.

I will explain what I have done.

1. On my computer edu@10.1.10.240 I have added a new user called sofia. This way I have sofia@10.1.10.208 and sofia@10.1.10.240.
2. I have downloaded Open MPI 1.2.7 from the Open MPI website on both computers, into /home/sofia/Desktop.
3. I have installed everything using "sudo ./configure", "sudo make" and "sudo make install".
4. To make ssh stop asking me for a password, I have typed on sofia@10.1.10.208 "ssh-keygen -t dsa", "cd $HOME/.ssh" and "cp id_dsa.pub authorized_keys". I have copied the directory /home/sofia/.ssh from sofia@10.1.10.208 to /home/sofia/.ssh on sofia@10.1.10.240. Passwordless ssh works on computer sofia@10.1.10.208, but computer sofia@10.1.10.208 asks me for a passphrase and for the password. Is that normal? (See the ssh sketch further below.)
5. I have created the directory /home/sofia/programasparalelos on both computers and I have given it permissions with "chmod 777".
6. I have copied the program PruebaSumaParalela.c into /home/sofia/programasparalelos on both computers (I have changed the program a little bit; I enclose the new version) and I have compiled it with "mpicc PruebaSumaParalela.c -o PruebaSumaParalela.out".

7. Now I run the program on both computers using the command:

mpirun -np 2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local ./PruebaSumaParalela.out

When I run the program I see 3 PIDs on every computer: 2 for "./PruebaSumaParalela.out" and 1 for "mpirun -np 2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local ./PruebaSumaParalela.out". I enclose the results obtained on every computer for every "./PruebaSumaParalela.out".
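For reference, the passwordless-ssh setup described in step 4 is essentially the following (a minimal sketch, assuming OpenSSH on both machines; the chmod and scp lines are additions not mentioned in step 4):

ssh-keygen -t dsa                       # press Enter at the passphrase prompts to leave them empty
cd $HOME/.ssh
cat id_dsa.pub >> authorized_keys
chmod 600 authorized_keys
scp -r $HOME/.ssh sofia@10.1.10.240:    # copy the whole .ssh directory to the other machine

If ssh still asks for a passphrase, the key was presumably generated with a non-empty passphrase; regenerating it with an empty passphrase (or loading it into ssh-agent) avoids that prompt.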

Thank you very much.

Sofia


----- Original Message -----
From: "Terry Dontje" <terry.don...@sun.com>
To: <us...@open-mpi.org>
Sent: Thursday, September 18, 2008 7:31 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv


It turns out you debugged mpirun. I actually wanted you to attach to your program, PruebaSumaParalela.out, on both nodes and dump each of their stacks. Is there a reason why you are using 1.2.2 instead of 1.2.7 or something from the 1.3 branch? I am wondering whether there is some sort of bug in the tcp BTL that is preventing it from matching your two interfaces.
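For example, something like the following on each node, reusing the dbx commands already used on mpirun (a rough sketch; the PIDs will of course differ on each node):

ps -ef | grep PruebaSumaParalela.out    # note the PID of each application process
dbx - <PID>                             # attach to one PruebaSumaParalela.out process
where                                   # dump that process's stack
quit

and the same again for the second PruebaSumaParalela.out PID.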

Another thing to try is to specifically list out the interfaces you want to have used. I do not think this is going to help but it can't hurt either. I would do something like:

mpirun -np 2 --host 10.4.5.123,edu@10.4.5.126 --mca mpi_preconnect_all 1 --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1 ./PruebaSumaParalela.out
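If eth1 is not the right interface name on your nodes, something like

/sbin/ifconfig -a

on each machine will list the available interfaces (just a sanity check, not Open MPI specific), and the value of btl_tcp_if_include can be adjusted to match.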


--td

Date: Thu, 18 Sep 2008 13:12:46 +0200
From: "Sofia Aparicio Secanellas" <sapari...@grpss.ssr.upm.es>
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: "Open MPI Users" <us...@open-mpi.org>
Message-ID: <35BA42AA514D45239323DB9B38B4C5CE@aparicio1>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello Terry,

Finally, I have installed dbx. I enclose a file with the result that I get when I type "dbx - PID of mpirun..." and then "where" on computer 10.4.5.123.

Do you have any idea what could be the problem?

Thanks a lot!!

Sofia



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int mynode, totalnodes;
    int sum, startval, endval, accum;
    MPI_Status status;

    printf("Inicio\n");
    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &totalnodes);
    MPI_Comm_rank(MPI_COMM_WORLD, &mynode);
    printf("totalnodes: %d\n", totalnodes);
    printf("mynode: %d\n", mynode);

    sum = 1110;

    if (mynode != 0) {
        /* Every rank other than 0 sends its value of sum to rank 0 with tag 1. */
        printf("Inicio Send\n");
        printf("sum: %d\n", sum);
        MPI_Send(&sum, 1, MPI_INT, 0, 1, MPI_COMM_WORLD);
        printf("Send sum\n");
    } else {
        /* Rank 0 receives a single value from rank 1 (the program assumes exactly 2 ranks). */
        printf("Inicio Recv\n");
        MPI_Recv(&accum, 1, MPI_INT, 1, 1, MPI_COMM_WORLD, &status);
        printf("RECV accum\n");
        printf("Sum\n");
    }

    printf("Final\n");
    if (mynode == 0)
        printf("The sum from 1 to 1000 is: %d\n", accum);

    MPI_Finalize();
    return 0;
}
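
For reference, the enclosed program is built and launched with the commands already given above (a minimal recap, assuming the Open MPI wrappers are in the PATH):

mpicc PruebaSumaParalela.c -o PruebaSumaParalela.out
mpirun -np 2 --host 10.1.10.208,10.1.10.240 --prefix /usr/local ./PruebaSumaParalela.out
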
current thread: t@3083777712
[1] 0xffffe410(0xb7edb458, 0x0, 0xb7d6fe56, 0xb7eb9a1b, 0xbf84c3a0, 0x0), at 0xffffe410
[2] __gettimeofday(0x8051530, 0x2, 0x0, 0xb7edb458, 0xb7eb9e69), at 0xb7d6fe56
[3] opal_event_loop(0x2, 0xb7c271a8, 0x1a9b860, 0x0, 0x1a9dd78, 0x0), at 0xb7eb9e89
[4] opal_progress(0x820d080, 0x8049c00, 0x1, 0xbf84c4c8, 0xb7fd1408, 0xb7fd1408), at 0xb7eb4606
[5] mca_pml_ob1_recv(0xbf84c4c8, 0x1, 0x8049c00, 0x1, 0x1, 0x8049d40), at 0xb7c1ce56
[6] PMPI_Recv(0xbf84c4c8, 0x1, 0x8049c00, 0x1, 0x1, 0x8049d40), at 0xb7fa8473
[7] main(0xb7feece0, 0x8048950, 0xbf84c558, 0xb7d02050, 0x1, 0xbf84c584), at 0x80488f3
current thread: t@3083495104
[1] 0xffffe410(0x0, 0xbf993888, 0xb7df66d7, 0xbf993894, 0xbf993954, 0x80), at 0xffffe410
[2] __libc_sigaction(0x11, 0xbf993950, 0x0), at 0xb7df66d7
[3] __sigaction(0x11, 0xbf993950, 0x0, 0x0, 0xb7e6cb40, 0x10000), at 0xb7df67c3
[4] opal_evsignal_recalc(0x8051ccc), at 0xb7e6cae5
[5] opal_poll_recalc(0x8051c78, 0x8051cb0, 0x0, 0x0, 0xb7bb9d13, 0x820f5c4), at 0xb7e6da40
[6] opal_event_base_loop(0x8051c78, 0x2, 0x0, 0xb7e8c458, 0xb7e6b0d9), at 0xb7e6ac25
[7] opal_event_loop(0x2, 0xb7be21b0, 0x989680, 0x0, 0x98c212, 0x0), at 0xb7e6b0f9
[8] opal_progress(0x820f180, 0x820d290, 0x4, 0xbf993bac, 0xb7de44c0, 0x0), at 0xb7e65646
[9] mca_pml_ob1_send(0xbf993bac, 0x1, 0x8049bf8, 0x0, 0x1, 0x4), at 0xb7bd87dc
[10] PMPI_Send(0xbf993bac, 0x1, 0x8049bf8, 0x0, 0x1, 0x8049d38), at 0xb7f5c754
[11] main(0xb7f9fce0, 0x8048950, 0xbf993c38, 0xb7cbcebc, 0x1, 0xbf993c64), at 0x804889f
current thread: t@3083286192
[1] 0xffffe410(0x0, 0x4, 0xb7d3d5ab, 0x8051584, 0x0, 0xb7e63458), at 0xffffe410
[2] __poll(0x8051628, 0x4, 0x0, 0xb7e63458, 0x8051568, 0x4), at 0xb7d3d5ab
[3] opal_poll_dispatch(0x8051530, 0x8051568, 0xbff0eff0, 0xb7ee58b4, 0x80482e8, 0x1), at 0xb7e44866
[4] opal_event_base_loop(0x8051530, 0x2, 0x0, 0xb7e63458, 0xb7e41e69), at 0xb7e41ae9
[5] opal_event_loop(0x2, 0xb7baf1a8, 0x1a9b860, 0x0, 0x1aae470, 0x0), at 0xb7e41e89
[6] opal_progress(0x820d080, 0x8049c00, 0x1, 0xbff0f118, 0xb7f59408, 0xb7f59408), at 0xb7e3c606
[7] mca_pml_ob1_recv(0xbff0f118, 0x1, 0x8049c00, 0x1, 0x1, 0x8049d40), at 0xb7ba4e56
[8] PMPI_Recv(0xbff0f118, 0x1, 0x8049c00, 0x1, 0x1, 0x8049d40), at 0xb7f30473
[9] main(0xb7f76ce0, 0x8048950, 0xbff0f1a8, 0xb7c8a050, 0x1, 0xbff0f1d4), at 0x80488f3
current thread: t@3083810496
[1] 0xffffe410(0x0, 0xbfc31438, 0xb7e436d7, 0xbfc31444, 0xbfc31504, 0x80), at 0xffffe410
[2] __libc_sigaction(0x11, 0xbfc31500, 0x0), at 0xb7e436d7
[3] __sigaction(0x11, 0xbfc31500, 0x0, 0xbfc31520, 0xb7eb9b40, 0x10000), at 0xb7e437c3
[4] opal_evsignal_recalc(0x8051ccc, 0x4, 0x0, 0xb7ed9458, 0x8051cb0, 0x4), at 0xb7eb9ae5
[5] opal_poll_dispatch(0x8051c78, 0x8051cb0, 0xbfc31620, 0x0, 0xb7c06d13, 0x820f5c4), at 0xb7ebaae0
[6] opal_event_base_loop(0x8051c78, 0x2, 0x0, 0xb7ed9458, 0xb7eb80d9), at 0xb7eb7d59
[7] opal_event_loop(0x2, 0xb7c2f1b0, 0x117b190, 0x0, 0x117df53, 0x0), at 0xb7eb80f9
[8] opal_progress(0x820f180, 0x820d290, 0x4, 0xbfc3177c, 0xb7e314c0, 0x0), at 0xb7eb2646
[9] mca_pml_ob1_send(0xbfc3177c, 0x1, 0x8049bf8, 0x0, 0x1, 0x4), at 0xb7c257dc
[10] PMPI_Send(0xbfc3177c, 0x1, 0x8049bf8, 0x0, 0x1, 0x8049d38), at 0xb7fa9754
[11] main(0xb7fecce0, 0x8048950, 0xbfc31808, 0xb7d09ebc, 0x1, 0xbfc31834), at 0x804889f
