Hi Reuti,

> your local machine is Linux like, but the execution hosts
> are Macs? I saw the /Users/tsakai/... in your output.

No, my environment is entirely Linux.  The path to my home
directory on one host (blitzen) is /Users/tsakai, even though
it is an NFS mount from vixen (which knows it as /home/tsakai).
For historical reasons, I created a symbolic link named /Users
that points to vixen's /home, so that I can use a consistent
path on both vixen and blitzen.
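
A rough sketch of that arrangement, rebuilt under a scratch directory
rather than the real filesystem root.  The `root` variable and the empty
fib.R are stand-ins for illustration only, not the actual setup on vixen:

```shell
# Sketch only: mimic the /Users -> /home link inside a throwaway directory.
root=$(mktemp -d)

# The real home directory lives under /home on vixen.
mkdir -p "$root/home/tsakai"
touch "$root/home/tsakai/fib.R"

# On the real host this would be roughly: ln -s /home /Users (as root).
ln -s home "$root/Users"

# The same file is now reachable through the /Users-style path as well.
ls -l "$root/Users/tsakai/fib.R"
```

With the link in place, a path such as /Users/tsakai/... resolves on
both hosts, which is why it shows up in the output.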

> Is this a private cluster (or at least private interfaces)?
> It would also be an option to use hostbased authentication,
> which will avoid setting any known_hosts file or passphraseless
> ssh-keys for each user.

No, it is not a private cluster.  It is Amazon EC2.  When I
ssh from my local machine (vixen) I use the public interface,
but to address one Amazon cluster node from the other I
use the nodes' private DNS names: domU-12-31-39-07-35-21 and
domU-12-31-39-06-74-E2.  Both the public and private DNS names
change from one launch to the next.  I am using passphraseless
ssh keys for authentication in all cases, i.e., from vixen to
Amazon node A, from Amazon node A to Amazon node B, and from
Amazon node B back to A.  (Please see my initial post.  There
is a session dialogue for this.)  They all work without an
authentication dialogue, except for a brief initial dialogue:
    The authenticity of host 'domu-xx-xx-xx-xx-xx-x (10.xx.xx.xx)'
    can't be established.
    RSA key fingerprint is e3:ad:75:b1:a4:63:7f:0f:c4:0b:10:71:f3:2f:21:81.
    Are you sure you want to continue connecting (yes/no)?
to which I say "yes."
But I am unclear about what you mean by "hostbased authentication."
Doesn't that mean using a password?  If so, it is not an option.
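
Regarding the first-connection prompt quoted above: a minimal, untested
sketch of one way it might be suppressed for the EC2 private names.  It
relies on the OpenSSH client options StrictHostKeyChecking and
UserKnownHostsFile.  The SSH_CONFIG variable is hypothetical and defaults
to a scratch file, and listing both the "domU-*" and "domu-*" patterns is
an assumption about how the names are matched.  Turning off host key
checking gives up protection against a man-in-the-middle, so this would
only be reasonable between nodes on a trusted private interface.

```shell
# Sketch only: append a host-key policy for the EC2 private names.
# SSH_CONFIG is a hypothetical variable; it defaults to a scratch file so
# that nothing real is modified unless it is pointed at ~/.ssh/config.
SSH_CONFIG=${SSH_CONFIG:-$(mktemp)}

cat >> "$SSH_CONFIG" <<'EOF'
# Private DNS names change on every launch, so do not pin their host keys.
Host domU-* domu-*
    StrictHostKeyChecking no
    UserKnownHostsFile /dev/null
EOF

# Show the resulting stanza for review.
cat "$SSH_CONFIG"
```

Since the node names change with every launch, a wildcard stanza like
this would not need to be edited for each new pair of instances.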

Regards,

Tena


On 2/10/11 2:27 AM, "Reuti" <re...@staff.uni-marburg.de> wrote:

> Hi,
> 
> your local machine is Linux like, but the execution hosts are Macs? I saw the
> /Users/tsakai/... in your output.
> 
> a) executing a command on them is also working, e.g.: ssh
> domU-12-31-39-07-35-21 ls
> 
> Am 10.02.2011 um 07:08 schrieb Tena Sakai:
> 
>> Hi,
>> 
>> I have made a bit of progress(?)...
>> I made a config file in my .ssh directory on the cloud.  It looks like:
>>     # machine A
>>     Host domU-12-31-39-07-35-21.compute-1.internal
> 
> This is just an abbreviation or nickname above. To use the specified settings,
> it's necessary to specify exactly this name. When the settings are the same
> anyway for all machines, you can use:
> 
> Host *
>     IdentityFile /home/tsakai/.ssh/tsakai
>     IdentitiesOnly yes
>     BatchMode yes
> 
> instead.
> 
> Is this a private cluster (or at least private interfaces)? It would also be
> an option to use hostbased authentication, which will avoid setting any
> known_hosts file or passphraseless ssh-keys for each user.
> 
> -- Reuti
> 
> 
>>     HostName domU-12-31-39-07-35-21
>>     BatchMode yes
>>     IdentityFile /home/tsakai/.ssh/tsakai
>>     ChallengeResponseAuthentication no
>>     IdentitiesOnly yes
>> 
>>     # machine B
>>     Host domU-12-31-39-06-74-E2.compute-1.internal
>>     HostName domU-12-31-39-06-74-E2
>>     BatchMode yes
>>     IdentityFile /home/tsakai/.ssh/tsakai
>>     ChallengeResponseAuthentication no
>>     IdentitiesOnly yes
>> 
>> This file exists on both machine A and machine B.
>> 
>> Now when I issue the mpirun command as below:
>>     [tsakai@domU-12-31-39-06-74-E2 ~]$ mpirun -app app.ac2
>> 
>> It hangs.  I control-C out of it and I get:
>>     mpirun: killing job...
>> 
>>     --------------------------------------------------------------------------
>>     mpirun noticed that the job aborted, but has no info as to the process
>>     that caused that situation.
>>     --------------------------------------------------------------------------
>>     --------------------------------------------------------------------------
>>     mpirun was unable to cleanly terminate the daemons on the nodes shown
>>     below. Additional manual cleanup may be required - please refer to
>>     the "orte-clean" tool for assistance.
>>     --------------------------------------------------------------------------
>>         domU-12-31-39-07-35-21.compute-1.internal - daemon did not report back when launched
>> 
>> Am I making progress?
>> 
>> Does this mean I am past authentication and something else is the problem?
>> Does someone have an example .ssh/config file I can look at?  There are so
>> many keyword-argument pairs for this config file, and I would like to look at
>> a very basic one that works.
>> 
>> Thank you.
>> 
>> Tena Sakai
>> tsa...@gallo.ucsf.edu
>> 
>> On 2/9/11 7:52 PM, "Tena Sakai" <tsa...@gallo.ucsf.edu> wrote:
>> 
>>> Hi
>>> 
>>> I have an app.ac1 file like below:
>>>     [tsakai@vixen local]$ cat app.ac1
>>>     -H vixen.egcrc.org   -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 5
>>>     -H vixen.egcrc.org   -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 6
>>>     -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 7
>>>     -H blitzen.egcrc.org -np 1 Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R 8
>>> 
>>> The program I run is
>>>     Rscript /Users/tsakai/Notes/R/parallel/Rmpi/local/fib.R x
>>> where x is in [5..8].  The machines vixen and blitzen each do 2 runs.
>>> 
>>> Here's the program fib.R:
>>>     [tsakai@vixen local]$ cat fib.R
>>>         # fib() computes, given index n, fibonacci number iteratively
>>>         # here's the first dozen sequence (indexed from 0..11)
>>>         # 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89
>>> 
>>>     fib <- function( n ) {
>>>         a <- 0
>>>         b <- 1
>>>         for ( i in 1:n ) {
>>>             t <- b
>>>             b <- a
>>>             a <- a + t
>>>         }
>>>         a
>>>     }
>>> 
>>>     arg <- commandArgs( TRUE )
>>>     myHost <- system( 'hostname', intern=TRUE )
>>>     cat( fib(arg), myHost, '\n' )
>>> 
>>> It reads an argument from command line and produces a fibonacci number that
>>> corresponds to that index, followed by the machine name.  Pretty simple
>>> stuff.
>>> 
>>> Here's the run output:
>>>     [tsakai@vixen local]$ mpirun -app app.ac1
>>>     5 vixen.egcrc.org
>>>     8 vixen.egcrc.org
>>>     13 blitzen.egcrc.org
>>>     21 blitzen.egcrc.org
>>> 
>>> Which is exactly what I expect.  So far so good.
>>> 
>>> Now I want to run the same thing on the cloud.  I launch 2 instances of the
>>> same virtual machine, which I reach by:
>>>     [tsakai@vixen local]$ ssh -A -i ~/.ssh/tsakai machine-instance-A-public-dns
>>> 
>>> Now I am on machine A:
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ # and I can go to machine B without password authentication,
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ # i.e., use public/private key
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
>>>     domU-12-31-39-00-D1-F2
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ ssh -i .ssh/tsakai domU-12-31-39-0C-C8-01
>>>     Last login: Wed Feb  9 20:51:48 2011 from 10.254.214.4
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$ # I am now on machine B
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$ hostname
>>>     domU-12-31-39-0C-C8-01
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$ # now show I can get to machine A without using password
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$ ssh -i .ssh/tsakai domU-12-31-39-00-D1-F2
>>>     The authenticity of host 'domu-12-31-39-00-d1-f2 (10.254.214.4)' can't be established.
>>>     RSA key fingerprint is e3:ad:75:b1:a4:63:7f:0f:c4:0b:10:71:f3:2f:21:81.
>>>     Are you sure you want to continue connecting (yes/no)? yes
>>>     Warning: Permanently added 'domu-12-31-39-00-d1-f2' (RSA) to the list of known hosts.
>>>     Last login: Wed Feb  9 20:49:34 2011 from 10.215.203.239
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
>>>     domU-12-31-39-00-D1-F2
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ exit
>>>     logout
>>>     Connection to domU-12-31-39-00-D1-F2 closed.
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$
>>>     [tsakai@domU-12-31-39-0C-C8-01 ~]$ exit
>>>     logout
>>>     Connection to domU-12-31-39-0C-C8-01 closed.
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ # back at machine A
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ hostname
>>>     domU-12-31-39-00-D1-F2
>>> 
>>> As you can see, neither machine uses a password for authentication; each uses
>>> public/private key pairs.  There is no problem (that I can see) with ssh
>>> invocation from one machine to the other.  This is because I have a copy of
>>> the public key and a copy of the private key on each instance.
>>> 
>>> The app.ac1 file is identical, except for the node names:
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ cat app.ac1
>>>     -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 5
>>>     -H domU-12-31-39-00-D1-F2 -np 1 Rscript /home/tsakai/fib.R 6
>>>     -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 7
>>>     -H domU-12-31-39-0C-C8-01 -np 1 Rscript /home/tsakai/fib.R 8
>>> 
>>> Here's what happens with mpirun:
>>> 
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$ mpirun -app app.ac1
>>>     tsakai@domu-12-31-39-0c-c8-01's password:
>>>     Permission denied, please try again.
>>>     tsakai@domu-12-31-39-0c-c8-01's password: mpirun: killing job...
>>> 
>>>     --------------------------------------------------------------------------
>>>     mpirun noticed that the job aborted, but has no info as to the process
>>>     that caused that situation.
>>>     --------------------------------------------------------------------------
>>> 
>>>     mpirun: clean termination accomplished
>>> 
>>>     [tsakai@domU-12-31-39-00-D1-F2 ~]$
>>> 
>>> mpirun (or something else?) asks me for a password, which I don't have.
>>> I end up typing control-C.
>>> 
>>> Here's my question:
>>> How can I get past authentication by mpirun where there is no password?
>>> 
>>> I would appreciate your help/insight greatly.
>>> 
>>> Thank you.
>>> 
>>> Tena Sakai
>>> tsa...@gallo.ucsf.edu
>>> 
>>> 
>>> 
>>> 
>>> 
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 

