On 12.11.2014 at 17:27, Reuti wrote:

> On 11.11.2014 at 02:25, Ralph Castain wrote:
> 
>> Another thing you can do is (a) ensure you built with --enable-debug, and
>> then (b) run it with -mca oob_base_verbose 100 (without the tcp_if_include
>> option) so we can watch the connection handshake and see what it is doing.
>> The --hetero-nodes option will have no effect here and can be ignored.
> 
> Done. It really tries to connect to the outside interface of the headnode.
> But firewall or not, the nodes have no clue how to reach
> 137.248.0.0: they have no gateway to this network at all.

I have to retract this. The nodes think there is a gateway, although there
isn't one. When I remove the gateway entry from the routing table by hand, it
starts up instantly too.
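
To illustrate why removing the gateway entry can make such a difference, here is a minimal Python sketch of the routing decision a node makes. It is only a model, not what Open MPI or the kernel actually runs: the function name route_kind, the internal network 192.168.0.0/24, and the address 137.248.0.1 are assumptions for illustration. The idea is that an address with no matching route is typically rejected immediately ("network is unreachable"), whereas an address that matches a default gateway is attempted, and the connect may then hang until it times out if the gateway does not forward the packets.

```python
import ipaddress


def route_kind(target, local_networks, has_default_gateway):
    """Classify how a host would try to reach `target` (simplified model)."""
    addr = ipaddress.ip_address(target)
    # Directly attached networks win over the default route.
    if any(addr in ipaddress.ip_network(net) for net in local_networks):
        return "on-link"
    # Otherwise the packet goes to the default gateway, if one is configured.
    return "via-gateway" if has_default_gateway else "unreachable"


# Hypothetical compute node with a single internal network (assumed values).
internal = ["192.168.0.0/24"]

# With a (bogus) gateway entry the external address is still attempted.
print(route_kind("137.248.0.1", internal, has_default_gateway=True))
# Without the gateway entry the attempt can fail right away.
print(route_kind("137.248.0.1", internal, has_default_gateway=False))
# The internal address of the headnode is reached directly either way.
print(route_kind("192.168.0.1", internal, has_default_gateway=True))
```

In this model, only the first case leads to a connection attempt that can stall, which matches the observation that startup is instant once the gateway entry is gone.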

I can do this on my own cluster, but I still see the 30-second delay on a
cluster where I'm not root, although there it may be caused by the firewall.
The gateway on that cluster really does lead to the outside world.
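
Without root access, one way to tell a dropped connection from a rejected one is to time a plain TCP connect from a compute node. The following is a rough Python sketch; the helper name probe and the 5-second timeout are assumptions, and the host and port to test would have to be taken from the verbose output. A firewall that silently drops packets usually shows up as "timed out" after the full timeout, while a missing route or a closed port tends to fail almost immediately.

```python
import socket
import time


def probe(host, port, timeout=5.0):
    """Try one TCP connect and return (outcome, elapsed_seconds)."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            outcome = "connected"
    except socket.timeout:
        # No answer at all within `timeout`: packets were probably dropped.
        outcome = "timed out"
    except OSError as exc:
        # Fast failures such as "connection refused" or "network is unreachable".
        outcome = f"failed: {exc.strerror or exc}"
    return outcome, time.monotonic() - start
```

Calling probe(host, port) once with the external and once with the internal address of the headnode should show which of the two is responsible for the delay.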

Personally, I find this behavior of using all interfaces a little too
aggressive. If you don't check this carefully beforehand and start a
long-running application, you might not even notice the delay during startup.

-- Reuti


> It tries this regardless of whether the internal or external name of the
> headnode is given in the machinefile. I hit ^C then. I attached the output
> of Open MPI 1.8.1 for this setup too.
> 
> -- Reuti
> 
> <openmpi1.8.3.txt><openmpi1.8.1.txt>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25777.php
