Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Ralph Castain
The FAQ assumes you realize that CIDR notation requires a value for the "x"... :-) On May 7, 2013, at 9:04 AM, Angel de Vicente wrote: > Hi, > > "Jeff Squyres (jsquyres)" writes: >> The list of names in the hostfile specifies the servers that will be used, >> not the network interfaces. Ha
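The point about CIDR notation can be checked with a short Python sketch. It uses only the standard `ipaddress` module; the helper name `parse_cidr` and the subnet `192.168.1.0/24` are assumptions made for illustration (the thread only shows the placeholder form `192.168.1.x/24`), so substitute the real internal network.

```python
import ipaddress


def parse_cidr(text):
    """Return the network described by a CIDR string, or None if it is invalid."""
    try:
        # strict=False also accepts a host address such as 192.168.1.5/24
        return ipaddress.ip_network(text, strict=False)
    except ValueError:
        return None


# The literal "x" from the FAQ is a placeholder, not a valid address octet.
placeholder = parse_cidr("192.168.1.x/24")

# With a concrete value in place of "x" the string parses as a network.
# 192.168.1.0/24 is an assumed example subnet, not taken from the thread.
concrete = parse_cidr("192.168.1.0/24")

if __name__ == "__main__":
    print("placeholder form parsed:", placeholder is not None)
    print("concrete form parsed:", concrete is not None)
    # Membership test: does a given (hypothetical) host address fall in the subnet?
    print(ipaddress.ip_address("192.168.1.17") in concrete)
```

The same check is a quick way to confirm that the value handed to btl_tcp_if_include really covers the addresses of the internal interfaces.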

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Jeff Squyres (jsquyres)
On May 7, 2013, at 12:04 PM, Angel de Vicente wrote: > But, the FAQ seems to be wrong, since it also says that I should be able > to run like: > > [angelv@comer RTI2D.Parallel]$ mpiexec -loadbalance --mca > btl_tcp_if_include 192.168.1.x/24 -prefix $OMPI_PREFIX -hostfile I think I meant the "x

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Angel de Vicente
Hi, "Jeff Squyres (jsquyres)" writes: > The list of names in the hostfile specifies the servers that will be used, > not the network interfaces. Have a look at the TCP portion of the FAQ: > > http://www.open-mpi.org/faq/?category=tcp Thanks a lot for this. Now it works OK if I run it lik

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Ralph Castain
Look at the MCA params btl_tcp_if_include and btl_tcp_if_exclude. Either include the internal network (to restrict to *only* using that one), or exclude the public one On May 7, 2013, at 8:25 AM, Angel de Vicente wrote: > Hi again, > > Angel de Vicente writes: >> yes, that's just what I did
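As a rough sketch of the two options described above, the following Python helper assembles an mpiexec command line that sets either btl_tcp_if_include or btl_tcp_if_exclude. The helper name `mpiexec_args`, the hostfile name `hosts`, the program `./a.out`, the subnet `192.168.1.0/24` and the interface name `eth0` are all hypothetical; only the `--mca` parameter names and the `-hostfile` option come from the thread. It assumes one of the two parameters is set, not both.

```python
def mpiexec_args(hostfile, program, include=None, exclude=None):
    """Build an mpiexec argument list that restricts the TCP BTL interfaces.

    Exactly one of include/exclude must be given.
    """
    if (include is None) == (exclude is None):
        raise ValueError("pass exactly one of include or exclude")
    if include is not None:
        param, value = "btl_tcp_if_include", include
    else:
        param, value = "btl_tcp_if_exclude", exclude
    return ["mpiexec", "--mca", param, value, "-hostfile", hostfile, program]


if __name__ == "__main__":
    # Option 1: use only the internal network (assumed subnet).
    print(" ".join(mpiexec_args("hosts", "./a.out", include="192.168.1.0/24")))
    # Option 2: keep the public interface out (assumed interface name).
    print(" ".join(mpiexec_args("hosts", "./a.out", exclude="eth0")))
```

The resulting list can be passed to `subprocess.run` or simply printed and pasted into a shell.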

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Jeff Squyres (jsquyres)
The list of names in the hostfile specifies the servers that will be used, not the network interfaces. Have a look at the TCP portion of the FAQ: http://www.open-mpi.org/faq/?category=tcp On May 7, 2013, at 11:25 AM, Angel de Vicente wrote: > Hi again, > > Angel de Vicente writes: >> y

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Angel de Vicente
Hi again, Angel de Vicente writes: > yes, that's just what I did with orted. I saw the port that it was > trying to connect to and tried to telnet to it, and I got "No route to host", so > that's why I was going down the firewall path. Hopefully the sysadmins can > disable the firewall for the internal network to

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-07 Thread Angel de Vicente
Hi, "Jeff Squyres (jsquyres)" writes: >>> I'm starting to think that perhaps it is a firewall issue? I don't have >>> root access on these machines but I'll try to investigate. > A simple test is to try any socket-based server app between the two > machines that opens a random listening socket. Tr
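One possible form of the suggested socket test, sketched in Python with only the standard library. The function names `open_listener` and `can_connect` are made up for this example, and it is not the specific tool the poster had in mind: one machine opens a listening socket on a port chosen by the OS, and the other machine tries to connect to it.

```python
import socket


def open_listener(host=""):
    """Open a TCP listening socket on an OS-chosen port; return (socket, port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((host, 0))  # port 0 asks the OS for any free port
    srv.listen(1)
    return srv, srv.getsockname()[1]


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and "No route to host".
        return False


if __name__ == "__main__":
    # Local self-check over loopback; across machines, run open_listener()
    # on one host, note the port, and call can_connect() from the other.
    srv, port = open_listener("127.0.0.1")
    print("listening on port", port)
    print("reachable:", can_connect("127.0.0.1", port))
    srv.close()
```

If `can_connect` returns False between the two machines while the loopback check succeeds, that points toward something on the path (such as a firewall) blocking arbitrary TCP ports rather than toward Open MPI itself.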

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-06 Thread Jeff Squyres (jsquyres)
On May 6, 2013, at 8:52 AM, Ralph Castain wrote: >> I'm starting to think that perhaps it is a firewall issue? I don't have >> root access on these machines but I'll try to investigate. > > Given that result, then yes - check iptables. I suspect they are running and > TCP socket comm is being bloc

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-06 Thread Ralph Castain
On May 6, 2013, at 2:10 AM, Angel de Vicente wrote: > Hi, > > Ralph Castain writes: > >> On May 4, 2013, at 4:54 PM, Angel de Vicente wrote: >>> >>> Is there any way to dump details of what OpenMPI is trying to do in each >>> node, so I can see if it is looking for different libraries in ea

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-06 Thread Angel de Vicente
Hi, Ralph Castain writes: > On May 4, 2013, at 4:54 PM, Angel de Vicente wrote: >> >> Is there any way to dump details of what OpenMPI is trying to do in each >> node, so I can see if it is looking for different libraries in each >> node, or something similar? thanks for the suggestions, but

Re: [OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-04 Thread Ralph Castain
On May 4, 2013, at 4:54 PM, Angel de Vicente wrote: > Hi, > > I have used OpenMPI before without any troubles, and configured MPICH, > MPICH2 and OpenMPI in many different machines before, but recently we > upgraded the OS to Fedora 17, and now I'm having trouble running an MPI > code in two of

[OMPI users] Help diagnosing problem: not being able to run MPI code across computers

2013-05-04 Thread Angel de Vicente
Hi, I have used OpenMPI before without any troubles, and configured MPICH, MPICH2 and OpenMPI in many different machines before, but recently we upgraded the OS to Fedora 17, and now I'm having trouble running an MPI code in two of our machines connected via a switch. I thought perhaps the old in