Okay, let's start with the basics :-)

How was this configured? What environment are you running in (rsh, slurm, ??)?
If you configured with --enable-debug, then please run it with

--mca plm_base_verbose 5 --debug-daemons

and send the output.
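
For reference, here is a rough sketch of how those flags might be combined with the command line from the original report quoted below. It reuses the -np value, hostfile path, and btl setting from that report; the debug_cmd helper and the mpirun_debug.log file name are only illustrative names, not anything Open MPI provides. The helper just prints the command so it can be checked before running.

```shell
#!/bin/sh
# Assumption: same process count and hostfile as in the original report.
NP="${NP:-66}"
HOSTFILE="${HOSTFILE:-/proj/SSA/Mellanox/tmp//20140810_070156_hostfile.txt}"

# Hypothetical helper: prints the mpirun command with the debug flags added.
debug_cmd() {
    echo mpirun -np "$NP" --hostfile "$HOSTFILE" \
        --mca btl tcp,self \
        --mca plm_base_verbose 5 --debug-daemons \
        hostname
}

# Show the command for review (dry run only; nothing is launched here).
debug_cmd

# To actually run it and keep a copy of the output to send to the list:
#   $(debug_cmd) 2>&1 | tee mpirun_debug.log
```

The extra verbosity is most useful with a build configured with --enable-debug; something like `ompi_info | grep -i debug` may help confirm how the install was built.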


On Aug 11, 2014, at 12:07 AM, Lenny Verkhovsky <len...@mellanox.com> wrote:

> I don’t think so,
> It’s always the 66th node, even if I swap the 65th and 66th.
> I also get the same error when setting np=66 while having only 65 hosts in 
> the hostfile.
> (I am using only the tcp btl.)
>  
>  
> Lenny Verkhovsky
> SW Engineer,  Mellanox Technologies
> www.mellanox.com
>  
> Office:    +972 74 712 9244
> Mobile:  +972 54 554 0233
> Fax:        +972 72 257 9400
>  
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, August 11, 2014 1:07 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI fails with np > 65
>  
> Looks to me like your 65th host is missing the dstore library - is it 
> possible you don't have your paths set correctly on all hosts in your 
> hostfile?
>  
>  
> On Aug 10, 2014, at 1:13 PM, Lenny Verkhovsky <len...@mellanox.com> wrote:
> 
> 
> Hi all,
>  
> Trying to run OpenMPI (trunk Revision: 32428), I ran into a problem running 
> OMPI with more than 65 procs.
> It looks like MPI fails to open the 66th connection, even when just running 
> `hostname` over tcp.
> It also seems to be unrelated to any specific host.
> All hosts are Ubuntu 12.04.1 LTS
> mpirun -np 66 --hostfile /proj/SSA/Mellanox/tmp//20140810_070156_hostfile.txt 
> --mca btl tcp,self hostname
> [nodename] [[4452,0],65] ORTE_ERROR_LOG: Error in file 
> base/ess_base_std_orted.c at line 288
> 
> …………………………………
> 
> It looks like an environment issue, but I can’t find any related limit.
> Any ideas?
> Thanks.
> Lenny Verkhovsky
>  
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/08/24961.php
>  
