Hello Community/Ralph,

I was told by the sysadmin that the firewall does not block communication between the two machines (tik33x and tik34x). However, a connection may still fail if OpenMPI tries to open TCP/UDP ports lower than 1024, which require superuser privileges.

Is it possible to find out which port numbers OpenMPI uses? Specifically, is it possible to specify port numbers that OpenMPI must not use (OpenMPI-1.4.x)?

Here is the reply I got from my sysadmin:

"There is a firewall, but it does not block internal traffic within the whole TIK network (I verified it for myself). Thus, the connection problem must be somewhere else (a service not running or binding to the wrong interface, for instance). Maybe the service wants to bind to a TCP or UDP port lower than 1024, which can only be allocated by the system's superuser. First, check on which ports and on which network card interfaces the software listens and if it is configured correctly so that it will listen at all."

Is there a way out?

Thanks a lot

Devendra

________________________________
From: Ralph Castain <r...@open-mpi.org>
To: devendra rai <rai.deven...@yahoo.co.uk>; Open MPI Users <us...@open-mpi.org>
Sent: Wednesday, 16 May 2012, 15:09
Subject: Re: [OMPI users] Returned "Unreachable" (-12) instead of "Success" (0)

Looks like you have a firewall between hosts tik34x and tik33x - you might check to ensure all firewalls are disabled. The error is saying it can't open a TCP socket between the two nodes, so there is no communication path between those two processes.

On May 16, 2012, at 4:22 AM, devendra rai wrote:

>Hello All,
>
>I am trying to run an OpenMPI application across two physical machines.
>
>I get an error "Returned "Unreachable" (-12) instead of "Success" (0)", and
>looking through the logs (attached), I cannot seem to find out the cause, and
>how to fix it.
>
>I see a lot of (communication) components being loaded and then unloaded, and I
>do not see which nodes pick up what kind of comm-interface.
>
>--------------------------------------------------------------------------
>At least one pair of MPI processes are unable to reach each other for
>MPI communications. This means that no Open MPI device has indicated
>that it can be used to communicate between these processes. This is
>an error; Open MPI requires that all MPI processes be able to reach
>each other. This error can sometimes be the result of forgetting to
>specify the "self" BTL.
>
> Process 1 ([[10782,1],6]) is on host: tik34x
> Process 2 ([[10782,1],0]) is on host: tik33x
> BTLs attempted: self sm tcp
>
>Your MPI job is now going to abort; sorry.
>
>The "mpirun" line is:
>
>mpirun --mca btl self,sm,tcp --mca btl_base_verbose 30 -report-pid
>-display-map -report-bindings -hostfile hostfile -np 7 -v --rankfile
>rankfile.txt -v --timestamp-output --tag-output ./xstartwrapper.sh
>./run_gdb.sh
>
>where the .sh files are fixes for forwarding X-windows from multiple machines
>to the machines where I am logged in.
>
>Can anyone help?
>
>Thanks a lot.
>
>Best,
>
>Devendra
>
>From: devendra rai <dev...@yahoo.co.uk>
>Subject: Returned "Unreachable" (-12) instead of "Success" (0)
>Date: May 16, 2012 4:18:28 AM MDT
>To: Open MPI Users <us...@open-mpi.org>
>Reply-To: devendra rai <rai.deven...@yahoo.co.uk>
>
><MPILog.txt>
>_______________________________________________
>users mailing list
>us...@open-mpi.org
>http://www.open-mpi.org/mailman/listinfo.cgi/users
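
Following the sysadmin's suggestion to check ports and interfaces, one way to separate a network problem from an Open MPI configuration problem is to test a plain TCP connection between the two hosts outside of MPI. The snippet below is only a rough sketch under assumptions: the function name `can_connect`, the default timeout, and any port number passed to it are illustrative choices, not values that Open MPI uses or reports.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host name and tries each address in turn.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, timeouts, and DNS failures.
        return False
```

For example, with a throwaway listener started on tik33x on an unprivileged port (say `python -m http.server 5000`, where 5000 is an arbitrary choice), calling `can_connect("tik33x", 5000)` from tik34x should return True if ordinary TCP traffic passes between the machines. A False result only says that this one connection attempt failed; it does not by itself identify whether a firewall, a wrong interface, or a missing listener is responsible.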