That explains it, thank you for the quick answer.

2013/11/11 Ralph Castain <r...@open-mpi.org>

> IIRC, 1.6.5 defaults to *not* using the tree spawn. We changed it in the
> 1.7 series because the launch performance is so much better.
>
>
> On Nov 11, 2013, at 8:22 AM, Christoffer Hamberg <christoffer.hamb...@gmail.com> wrote:
>
> I re-configured the ssh keys now and for some reason it seems to work. But
> what baffles me is that the same ssh configuration worked for the other
> installation (1.6.5) but not for this one.
>
> Thanks for the help!
>
>
> 2013/11/11 Reuti <re...@staff.uni-marburg.de>
>
>> On 11.11.2013 at 10:04, Christoffer Hamberg wrote:
>>
>> > (Correction: I mixed up the output of the first two examples in my
>> > first mail, so it fails on the first one)
>> >
>> > ubuntu@node0:~$ mpirun --leave-session-attached -mca plm_base_verbose 5 -np 4 -host node0,node1,node2,node3 hostname
>> > [node0:01486] mca:base:select:( plm) Querying component [slurm]
>> > [node0:01486] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
>> > [node0:01486] mca:base:select:( plm) Querying component [rsh]
>> > [node0:01486] mca:base:select:( plm) Query of component [rsh] set priority to 10
>> > [node0:01486] mca:base:select:( plm) Selected component [rsh]
>> > [node2:26962] mca:base:select:( plm) Querying component [rsh]
>> > [node2:26962] mca:base:select:( plm) Query of component [rsh] set priority to 10
>> > [node2:26962] mca:base:select:( plm) Selected component [rsh]
>> > [node1:11477] mca:base:select:( plm) Querying component [rsh]
>> > [node1:11477] mca:base:select:( plm) Query of component [rsh] set priority to 10
>> > [node1:11477] mca:base:select:( plm) Selected component [rsh]
>> > Host key verification failed.
>> >
>> >
>> > ubuntu@node0:~$ mpirun -mca plm_rsh_no_tree_spawn 1 -np 4 -host node0,node1,node2,node3 hostname
>> > node0
>> > node1
>> > node2
>> > node3
>> >
>> > So it definitely looks like a problem with the tree spawn. Any clue how
>> > I could proceed?
>>
>> Is passphraseless ssh also possible between the nodes? Using hostbased
>> authentication, it is also possible to enable it for all users without
>> the need to prepare the ssh keys.
>>
>> -- Reuti
>>
>>
>> > /Christoffer
>> >
>> >
>> > 2013/11/11 Ralph Castain <r...@open-mpi.org>
>> > Add --enable-debug to your configure and run it with the following
>> > additional options
>> >
>> > --leave-session-attached -mca plm_base_verbose 5
>> >
>> > Let's see where it fails during the launch phase. Offhand, the only
>> > thing that message means to me is that the ssh keys are botched on at
>> > least one node. Keep in mind that we use a tree-based launch, and so
>> > when you have more than two nodes, one or more of the intermediate
>> > nodes are executing an ssh.
>> >
>> > One way to see if that's the problem is to launch without the tree
>> > spawn: add
>> >
>> > -mca plm_rsh_no_tree_spawn 1
>> >
>> > to your cmd line and see if it works.
>> >
>> >
>> >
>> > On Nov 10, 2013, at 9:24 AM, Christoffer Hamberg <christoffer.hamb...@gmail.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> I'm having some strange problems running Open MPI (1.9a1r29559) with
>> >> Java bindings on a Calxeda highbank ARM Server running Ubuntu 12.10
>> >> (GNU/Linux 3.5.0-43-highbank armv7l).
>> >>
>> >> The problem arises when I try to run a job on more than 3 nodes (I
>> >> have a total of 8).
>> >> Note: It's the same error for any of the node[0-7].
>> >>
>> >> ubuntu@node0:~$ mpirun -np 4 -host node0,node1,node2 hostname
>> >> Host key verification failed.
>> >>
>> >> ubuntu@node0:~$ mpirun -np 4 -host node0,node1,node2,node3 hostname
>> >> node0
>> >> node0
>> >> node1
>> >> node2
>> >>
>> >> and not running the job on the current node also gives Host key
>> >> verification failed for only 3 nodes.
>> >>
>> >> ubuntu@node0:~$ mpirun -np 4 -host node1,node3,node5 hostname
>> >> Host key verification failed.
>> >>
>> >> But not on 2 nodes:
>> >> ubuntu@node0:~$ mpirun -np 4 -host node1,node3 hostname
>> >> node1
>> >> node1
>> >> node3
>> >> node3
>> >>
>> >> I've configured it with the following:
>> >> ./configure --prefix=/opt/openmpi-1.9-java --without-openib --enable-static --with-threads=posix --enable-mpi-thread-multiple --enable-mpi-java --with-jdk-bindir=/usr/lib/jvm/java-7-openjdk-armhf/bin --with-jdk-headers=/usr/lib/jvm/java-7-openjdk-armhf/include
>> >>
>> >> I have Open MPI 1.6.5 (without Java-binding) installed and it runs
>> >> without any problems on all nodes, so there should be no problem with SSH
>> >> that the error points to.
>> >>
>> >> Any ideas?
>> >>
>> >> Regards,
>> >> Christoffer
>> >> _______________________________________________
>> >> users mailing list
>> >> us...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
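
For anyone hitting the same error: as explained in the thread, with the tree spawn the intermediate nodes run ssh themselves, so passphraseless ssh (with already-accepted host keys) has to work between the compute nodes and not only from the node where mpirun is started. The following is a minimal sketch of how one might check that, assuming an OpenSSH client is installed on every node. The function name check_pairs is made up for this example and is not an Open MPI tool. BatchMode=yes makes ssh fail instead of prompting, so an unknown host key or a missing key should show up as a FAIL line.

```shell
#!/bin/sh
# Sketch: verify that every node can ssh to every other node without prompts.
# check_pairs is an illustrative helper name, not part of Open MPI.

check_pairs() {
    failed=0
    for src in "$@"; do
        for dst in "$@"; do
            # Skip the trivial case of a node connecting to itself.
            [ "$src" = "$dst" ] && continue
            # Hop to src first, then from src to dst, the way an
            # intermediate node would during a tree-based launch.
            # -n keeps ssh from reading stdin; BatchMode=yes disables prompts.
            if ssh -n -o BatchMode=yes "$src" ssh -n -o BatchMode=yes "$dst" true; then
                echo "ok   $src -> $dst"
            else
                echo "FAIL $src -> $dst"
                failed=1
            fi
        done
    done
    # Non-zero exit status if any pair failed.
    return "$failed"
}

# Example (node names taken from the thread):
# check_pairs node0 node1 node2 node3
```

Any pair reported as FAIL is a candidate for the "Host key verification failed" message; launching with -mca plm_rsh_no_tree_spawn 1, as suggested above, only needs ssh to work from the mpirun node.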