Okay, let's start with the basics :-)

How was this configured? What environment are you running in (rsh, slurm, ??)? If you configured with --enable-debug, then please run it with

--mca plm_base_verbose 5 --debug-daemons

and send the output.

On Aug 11, 2014, at 12:07 AM, Lenny Verkhovsky <len...@mellanox.com> wrote:

> I don’t think so,
> It’s always the 66th node, even if I swap the 65th and 66th.
> I also get the same error when setting np=66 while having only 65 hosts in the hostfile.
> (I am using only the tcp btl.)
>
> Lenny Verkhovsky
> SW Engineer, Mellanox Technologies
> www.mellanox.com
>
> Office: +972 74 712 9244
> Mobile: +972 54 554 0233
> Fax: +972 72 257 9400
>
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain
> Sent: Monday, August 11, 2014 1:07 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] OpenMPI fails with np > 65
>
> Looks to me like your 65th host is missing the dstore library - is it
> possible you don't have your paths set correctly on all hosts in your
> hostfile?
>
> On Aug 10, 2014, at 1:13 PM, Lenny Verkhovsky <len...@mellanox.com> wrote:
>
> Hi all,
>
> Trying to run OpenMPI (trunk Revision: 32428), I hit a problem running
> OMPI with more than 65 procs.
> It looks like MPI fails to open the 66th connection, even when running `hostname`
> over tcp.
> It also seems to be unrelated to any specific host.
> All hosts are Ubuntu 12.04.1 LTS.
>
> mpirun -np 66 --hostfile /proj/SSA/Mellanox/tmp//20140810_070156_hostfile.txt --mca btl tcp,self hostname
> [nodename] [[4452,0],65] ORTE_ERROR_LOG: Error in file base/ess_base_std_orted.c at line 288
>
> …………………………………
>
> It looks like an environment issue, but I can’t find any related limit.
> Any ideas?
> Thanks.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2014/08/24961.php