Jeff Squyres <jsquyres <at> cisco.com> writes:

> On Nov 30, 2011, at 6:03 AM, Jaison Paul wrote:
>
> > Yes, we have set up .ssh file on remote EC2 hosts. Is there anything
> > else that we should be taking care of when dealing with EC2?
>
> I have heard that Open MPI's TCP latency on EC2 is horrid. I actually
> talked with some Amazon / EC2 folks about it at SC'11 a few weeks ago;
> we set a date to dive into it a bit deeper in December.
>
> No promises on when/if the TCP latency will improve, but it's
> definitely something that we're looking at. My first *guess* is that
> it might have something to do with specifying btl_tcp_if_include /
> oob_tcp_if_include improperly (or not at all) -- but that's a SWAG.
>

I have tried a little bit more. I set the MCA parameters as follows:

    mpirun -np 1 --mca btl tcp,self --mca btl_tcp_if_exclude lo,eth0 -hostfile hostinfo nbs-client -bynode

But it still failed, with the following error:

    Permission denied (publickey).
    --------------------------------------------------------------------------
    A daemon (pid 24744) died unexpectedly with status 255 while attempting
    to launch so we are aborting.

    There may be more information reported by the environment (see above).

    This may be because the daemon was unable to find all the needed shared
    libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
    location of the shared libraries on the remote nodes and this will
    automatically be forwarded to the remote nodes.
    --------------------------------------------------------------------------
    mpirun: clean termination accomplished

I don't understand the "Permission denied (publickey)" error. I can access the EC2 instance using password-less ssh as follows:

    ssh ubuntu@ec2-67-202-**-***.compute-1.amazonaws.com

So, what went wrong?

The hostinfo file is:

    [jmulerik@jaison Client]$ cat hostinfo
    localhost
    ubu...@ec2-67-202-48-118.compute-1.amazonaws.com

Jaison