Hi,

On 12.03.2014 at 07:37, Victor wrote:

> I am using openmpi 1.7.4 on Ubuntu 12.04 x64 and I have a very odd problem.
> 
> I have 4 nodes, all of which are defined in the hostfile and in /etc/hosts.
> 
> I can log into each node using ssh and certificate method from the shell that 
> is running the mpi job, by using their names as defined in /etc/hosts.
> 
> I can run an mpi job if I include only 3 nodes in the hostfile, for example:
> 
> Node1 slots=8 max-slots=8
> Node2 slots=8 max-slots=8
> Node3 slots=8 max-slots=8

Are you using an uppercase name here intentionally? Is this the name the host 
returns from `hostname`? Although uppercase is allowed and should be converted 
to lowercase (or the case ignored) during hostname resolution, I have found that 
not all programs do this. In my experience it is best to use only lowercase 
characters.
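
To check both points at once, something like the following could be run against 
the hostfile. It is only a minimal sketch: the function names `parse_hostfile` 
and `check_hosts` are made up for illustration, it assumes the host name is the 
first token on each line and that `#` starts a comment, and it uses the Python 
standard resolver, which may not behave exactly like the lookup ssh performs.

```python
import socket
import sys


def parse_hostfile(text):
    """Return the host names from hostfile text (first token of each line)."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            hosts.append(line.split()[0])
    return hosts


def check_hosts(hosts, resolver=socket.gethostbyname):
    """Map each host name to a tuple (has_uppercase, resolves)."""
    report = {}
    for host in hosts:
        try:
            resolver(host)
            resolves = True
        except OSError:  # socket.gaierror is a subclass of OSError
            resolves = False
        report[host] = (host != host.lower(), resolves)
    return report


# Only runs when a hostfile path is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        result = check_hosts(parse_hostfile(fh.read()))
    for host, (has_upper, resolves) in result.items():
        print(host,
              "uppercase" if has_upper else "lowercase",
              "resolves" if resolves else "DOES NOT RESOLVE")
```

Saved under a hypothetical name such as `check_hosts.py`, it would be called as 
`python3 check_hosts.py hostfile`. Running it on each of the four nodes and 
comparing the output may show whether name resolution differs between machines.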

Is the same version of Ubuntu Linux installed on all machines?

-- Reuti


> But if I add a fourth node to the hostfile, e.g.:
> 
> Node1 slots=8 max-slots=8
> Node2 slots=8 max-slots=8
> Node3 slots=8 max-slots=8
> Node4 slots=8 max-slots=8
> 
> I get this error after attempting mpirun -np 32 --hostfile hostfile a.out:
> 
> ssh: Could not resolve hostname Node4: Name or service not known.
> 
> But, I can log into Node4 using ssh from the same shell by using ssh Node4.
> 
> Also if I mix up the hostfile like this for example and place Node1 to the 
> last spot:
> 
> Node4 slots=8 max-slots=8
> Node2 slots=8 max-slots=8
> Node3 slots=8 max-slots=8
> Node1 slots=8 max-slots=8
> 
> The error becomes 
> 
> ssh: Could not resolve hostname Node1: Name or service not known.
> 
> If I then go back to the three node hostfile like this:
> 
> Node1 slots=8 max-slots=8
> Node4 slots=8 max-slots=8
> Node2 slots=8 max-slots=8
> 
> There is no error with three nodes even though both Node1 and Node4 "cannot 
> be found" if they are present in a 4 node hostfile in the last spot. The last 
> slot seems to be bugged.
> 
> What is going on? How do I fix this?
