It could also have been that you didn't have exactly matching Open MPI
installations on both machines. Even if they were the same version,
if they weren't configured and installed the same way on both machines,
this could have led to problems. Also be sure that either the MPI
application is runnable on both systems or you have a
separate MPI application binary for each system (e.g., to account for
glibc and other differences between your two OSes).
Running in heterogeneous situations like that is quite difficult to
do, and not for the meek. :-)
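One rough way to check whether two installations really match is to compare a text description of each. The sketch below assumes you have captured something like the output of ompi_info on each machine into a string; the helper names line_set and differing_lines are made up for illustration and are not part of Open MPI.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a text dump into the set of its non-empty lines.
static std::set<std::string> line_set(const std::string& text)
{
    std::set<std::string> lines;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line))
        if (!line.empty())
            lines.insert(line);
    return lines;
}

// Hypothetical helper: return every line that appears in exactly one
// of the two dumps (sorted). An empty result means the dumps agree,
// ignoring line order and duplicate lines.
std::vector<std::string> differing_lines(const std::string& host_a,
                                         const std::string& host_b)
{
    const std::set<std::string> a = line_set(host_a);
    const std::set<std::string> b = line_set(host_b);
    std::vector<std::string> diff;
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(),
                                  std::back_inserter(diff));
    return diff;
}
```

Expect some lines to differ legitimately (build host, build date); the ones worth a closer look are the version, the prefix, and the configure options.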
On Jun 13, 2008, at 2:12 AM, Manuel Freiberger wrote:
Hello,
Well, actually I'm quite sure that it was not the firewall, because I
had to turn it off; otherwise no connection could be established. So my
iptables --list
returns
Chain INPUT (policy ACCEPT)
target prot opt source destination
Chain FORWARD (policy ACCEPT)
target prot opt source destination
Chain OUTPUT (policy ACCEPT)
target prot opt source destination
on both machines. After reinstalling OMPI, I did not make any changes
to the firewall, but now it works without problems. Probably installing
the library with exactly the same configuration (same --prefix and so
on) did the trick.
But nonetheless, thank you very much for your hint! :-)
Best regards,
Manuel
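The empty ruleset quoted in the message above can also be checked mechanically. This is a minimal sketch, assuming the listing has the same shape as the iptables --list output in that message; firewall_is_open is a hypothetical helper, not part of iptables or Open MPI.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: returns true only if the listing contains at
// least one chain, every chain that states a policy says ACCEPT, and
// no rule lines follow the "target prot opt ..." column headers.
bool firewall_is_open(const std::string& listing)
{
    std::istringstream in(listing);
    std::string line;
    bool saw_chain = false;
    while (std::getline(in, line)) {
        std::istringstream words(line);
        std::string first;
        if (!(words >> first))
            continue;  // blank line between chains
        if (first == "Chain") {
            saw_chain = true;
            const bool has_policy = line.find("(policy ") != std::string::npos;
            const bool accepts = line.find("(policy ACCEPT") != std::string::npos;
            if (has_policy && !accepts)
                return false;  // e.g. policy DROP
        } else if (first != "target") {
            return false;  // anything else is treated as a rule line
        }
    }
    return saw_chain;  // an empty listing proves nothing
}
```

The check is deliberately strict: any rule at all makes it return false, even one that would let Open MPI traffic through.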
On Thursday 12 June 2008 18:23, Rainer Keller wrote:
Hi,
are you sure it was not a firewall issue on the SUSE 10.2 machine?
If there are any connections from the Gentoo machine trying to access
the orted on the SUSE machine, check in /var/log/firewall.
For the time being, try stopping the firewall (as root) with
/etc/init.d/SuSEfirewall2_setup stop
and test whether it works ,-]
With best regards,
Rainer
On Thursday, 12 June 2008, Manuel Freiberger wrote:
Hi!
OK, I found the problem. I reinstalled OMPI on both PCs, but this time
only locally in the user's home directory. Now the sample code works
perfectly. I'm not sure where the error really was. It could be that it
was a problem with the Gentoo installation, because OMPI is still
marked unstable in portage (~x86 keyword).
Best regards,
Manuel
On Wednesday 11 June 2008 18:52, Manuel Freiberger wrote:
Hello everybody!
First of all, I want to point out that I'm a beginner with Open MPI,
and all I'm trying to achieve is to get a simple program working on
two PCs.
So far I've installed Open MPI 1.2.6 on two PCs (one running OpenSUSE
10.2, the other one Gentoo).
I set up two identical users on both systems and made sure that I can
make an SSH connection between them using private/public key
authentication.
Next I ran the command
mpirun -np 2 --hostfile myhosts uptime
which gave the result
6:41pm up 1 day 5:16, 4 users, load average: 0.00, 0.07, 0.17
18:43:45 up 7:36, 6 users, load average: 0.00, 0.02, 0.05
so I concluded that MPI should work in principle.
Next I tried the following code which I copied from Boost.MPI:
---- snip
#include <mpi.h>
#include <iostream>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
    {
        // Rank 0 sends a single int to rank 1 with tag 0.
        std::cout << "Rank 0 is going to send" << std::endl;
        int value = 17;
        int result = MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        if (result == MPI_SUCCESS)
            std::cout << "Rank 0 OK!" << std::endl;
    }
    else if (rank == 1)
    {
        // Rank 1 blocks until the matching message from rank 0 arrives.
        std::cout << "Rank 1 is waiting for answer" << std::endl;
        int value;
        MPI_Status status;
        int result = MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                              &status);
        if (result == MPI_SUCCESS && value == 17)
            std::cout << "Rank 1 OK!" << std::endl;
    }

    MPI_Finalize();
    return 0;
}
---- snap
Starting a parallel job using
mpirun -np 2 --hostfile myhosts mpi-test
I get the answer
Rank 0 is going to send
Rank 1 is waiting for answer
Rank 0 OK!
and then the program hangs. So the strange thing is that the
MPI_Recv() call is obviously blocking, which I do not understand.
Could anybody provide some hints about where I should start looking
for the mistake? Any help is welcome!
Best regards,
Manuel
--
Manuel Freiberger
Institute of Medical Engineering
Graz University of Technology
manuel.freiber...@tugraz.at
--
Jeff Squyres
Cisco Systems