You probably want to check with your admin and ensure that either 
firewalling software is disabled or that a trust relationship is set up 
between the machines that you want to use.  Effectively, Open MPI needs to be 
able to open random TCP ports between all the hosts that you will be using.

There are controls to restrict Open MPI's TCP port selection, but it's 
generally easier if you can just disable firewalling or set up trust between 
the machines.
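
If you want some evidence to take to your admin first, a small probe along 
these lines may help.  This is only a rough sketch and not part of Open MPI: 
the names can_connect and listen_once are made up for illustration, and it 
assumes Python 3 is available on the nodes.  It checks whether one host can 
open a TCP connection to an OS-chosen port on another, which is roughly what 
Open MPI needs to be able to do.

```python
import socket


def listen_once(port=0):
    """Open a listening TCP socket on all interfaces.

    Port 0 asks the OS to pick a free port, similar in spirit to the
    random ports Open MPI ends up using. (Hypothetical helper.)
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("", port))
    srv.listen(1)
    return srv


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds.

    (Hypothetical helper.) Any OSError counts as failure; that covers
    "Connection refused", timeouts, and unreachable hosts.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, call listen_once() on intel02, read the chosen port with 
srv.getsockname()[1], and then call can_connect("intel02", port) from 
intel01.  A failure suggests that something (possibly a firewall) is blocking 
that port; a success on one port does not prove that every port is open.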


On Mar 24, 2010, at 4:45 PM, haoanyi wrote:

> I run a program with the following command line, and obtain the error message
> mpirun -x LD_LIBRARY_PATH=/home/haoanyi1/socIntel/goto --prefix 
> /home/haoanyi1/openmpi1.4.1 -np 2 -host intel01,intel02  -rf hosts ./main 62 
> 62 tests/ > newtest_64x64_np2_omp
> 
> [btl_tcp_endpoint.c:638:mca_btl_tcp_endpoint_complete_connect] connect() to 
> 192.168.122.1 failed: Connection refused (111)
> 
> In the hosts file, I use the following to do CPU mapping:
> rank 0=intel01 slot=0
> rank 1=intel02 slot=1
> 
> This file is different from the hosts file that I use with mpirun --hostfile 
> hosts hostname, which reads like
> intel01
> intel02
> ......
> 
> 2010-03-25 04:33:24, "Jeff Squyres" <jsquy...@cisco.com> wrote:
> 
> >Can you mpirun non-MPI applications, like "hostname"?  I frequently run this 
> >as a first step to debugging a wonky install.  For example:
> >
> >shell$ hostname
> >barney
> >shell$ mpirun hostname
> >barney
> >shell$ cat hosts
> >barney
> >rubble
> >shell$ mpirun --hostfile hosts hostname
> >barney
> >rubble
> >shell$
> >
> >On Mar 24, 2010, at 4:28 PM, haoanyi wrote:
> >
> >> Hi,
> >>
> >> I installed OpenMPI1.4.1 as a non-root user on a cluster. It is totally OK 
> >> when I run with mpirun or mpiexec on one single node for many processes. 
> >> However, when I launch many processes on multiple nodes, I can observe 
> >> jobs are distributed to those nodes (by using "top"), but all the jobs 
> >> just hang there and cannot finish.
> >>
> >> I think the nodes use TCP to communicate with each other. This cluster 
> >> also provides MPICH2, which was configured by the sys admin., and has no 
> >> problem to do node communication in MPICH2. Besides, I read from some 
> >> posts, which says this may be caused by TCP firewall. Since I have no 
> >> root's right, and I don't know what shall request the admin. to do to fix 
> >> this problem. So, can you tell me how to do that either by the admin root 
> >> or by the non-root user (if possible)?
> >>
> >> Thank you very much.
> >> Hao
> >>
> >> _______________________________________________
> >> users mailing list
> >> us...@open-mpi.org
> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >--
> >Jeff Squyres
> >jsquy...@cisco.com
> >For corporate legal information go to:
> >http://www.cisco.com/web/about/doing_business/legal/cri/
> >
> >_______________________________________________
> >users mailing list
> >us...@open-mpi.org
> >http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

