Jarek Jarcec Cecho created SQOOP-2709:
-----------------------------------------

             Summary: Sqoop2: HDFS: Impersonation on secured cluster doesn't work
                 Key: SQOOP-2709
                 URL: https://issues.apache.org/jira/browse/SQOOP-2709
             Project: Sqoop
          Issue Type: Bug
            Reporter: Jarek Jarcec Cecho
            Assignee: Jarek Jarcec Cecho
             Fix For: 1.99.7


Using the HDFS connector on a secured cluster currently fails with the 
following exception:

{code}
2015-11-19 13:24:30,624 [OutputFormatLoader-consumer] ERROR 
org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor  - Error while loading 
data out of MR job.
org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0005:Error 
occurs during loader run
        at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:119)
        at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:60)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:60)
        at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:44)
        at 
org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:267)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: 
"sqoopkrb-4.vpc.cloudera.com/172.28.211.196"; destination host is: 
"sqoopkrb-1.vpc.cloudera.com":8020; 
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
        at org.apache.hadoop.ipc.Client.call(Client.java:1476)
        at org.apache.hadoop.ipc.Client.call(Client.java:1403)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
        at com.sun.proxy.$Proxy15.create(Unknown Source)
        at 
org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
        at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
        at com.sun.proxy.$Proxy16.create(Unknown Source)
        at 
org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1867)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1737)
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1662)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:404)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
        at 
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
        at 
org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
        at 
org.apache.sqoop.connector.hdfs.hdfsWriter.HdfsTextWriter.initialize(HdfsTextWriter.java:40)
        at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:93)
        ... 12 more
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate 
failed [Caused by GSSException: No valid credentials provided (Mechanism level: 
Failed to find any Kerberos tgt)]
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at 
org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
        at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
        at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
        at org.apache.hadoop.ipc.Client.call(Client.java:1442)
        ... 36 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by 
GSSException: No valid credentials provided (Mechanism level: Failed to find 
any Kerberos tgt)]
        at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
        at 
org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
        at 
org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
        at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
        ... 39 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed 
to find any Kerberos tgt)
        at 
sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
        at 
sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
        at 
sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
        at 
sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
        at 
sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
        at 
com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
        ... 48 more
{code}

It's a very long exception, but the gist of it is here:

{code}
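Caused by: java.io.IOException: Failed on local exception: java.io.IOException: 
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: 
No valid credentials provided (Mechanism level: Failed to find any Kerberos 
tgt)]; Host Details : local host is: 
"sqoopkrb-4.vpc.cloudera.com/172.28.211.196"; destination host is: 
"sqoopkrb-1.vpc.cloudera.com":8020;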
{code}

[~abrahamfine] and I have triaged it to the fact that we do impersonation 
exactly the same way on the Sqoop 2 server side and on the mapper side. 
However, on the mapper side we no longer have a Kerberos ticket; we have only 
a delegation token for the {{sqoop2}} user. The 
[Hadoop documentation|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html]
 contains this very relevant snippet:

{quote}
If the cluster is running in Secure Mode, the superuser must have kerberos 
credentials to be able to impersonate another user. It cannot use delegation 
tokens for this feature. 
{quote}

Hence, in order to do impersonation properly on a secured cluster, we will 
have to do some dark magic with delegation tokens: retrieve a delegation token 
(DT) for the end user inside the HDFS initialization, where the Kerberos ticket 
is still available, and pass it to the execution engine.
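
To make the reasoning concrete, here is a toy model of the rule quoted above and of the proposed flow. It is only a sketch under stated assumptions: {{ToyNameNode}}, {{Credential}}, {{initializeOnServer}} and {{loadOnMapper}} are hypothetical names invented for illustration, not Hadoop or Sqoop classes, and the end user {{alice}} is made up. A real fix would presumably go through Hadoop's {{UserGroupInformation}} and delegation-token APIs instead.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the credential rule; none of these types are Hadoop classes. */
final class DelegationTokenSketch {

    enum CredentialKind { KERBEROS_TICKET, DELEGATION_TOKEN }

    /** A credential: whom it authenticates and what kind it is. */
    static final class Credential {
        final String owner;
        final CredentialKind kind;

        Credential(String owner, CredentialKind kind) {
            this.owner = owner;
            this.kind = kind;
        }
    }

    /** Hypothetical stand-in for the checks a secured NameNode performs. */
    static final class ToyNameNode {
        /** Returns the effective user, or throws if impersonation is not allowed. */
        String authorize(Credential caller, String effectiveUser) {
            boolean impersonating = !caller.owner.equals(effectiveUser);
            // Rule from the quoted documentation: impersonation needs Kerberos
            // credentials; a delegation token is not enough.
            if (impersonating && caller.kind != CredentialKind.KERBEROS_TICKET) {
                throw new SecurityException(caller.owner + " cannot impersonate "
                        + effectiveUser + " with a delegation token");
            }
            return effectiveUser;
        }

        /** Issues a delegation token owned by the end user. */
        Credential issueDelegationToken(Credential caller, String endUser) {
            authorize(caller, endUser);
            return new Credential(endUser, CredentialKind.DELEGATION_TOKEN);
        }
    }

    /** Server side: the Kerberos ticket is still available, so fetch the end user's DT here. */
    static Map<String, Credential> initializeOnServer(ToyNameNode nameNode, String endUser) {
        Credential serverTicket = new Credential("sqoop2", CredentialKind.KERBEROS_TICKET);
        Map<String, Credential> jobCredentials = new HashMap<>();
        jobCredentials.put(endUser, nameNode.issueDelegationToken(serverTicket, endUser));
        return jobCredentials; // handed to the execution engine along with the job
    }

    /** Mapper side: no Kerberos ticket, so act with the end user's own DT (no impersonation). */
    static String loadOnMapper(ToyNameNode nameNode, Map<String, Credential> jobCredentials,
                               String endUser) {
        return nameNode.authorize(jobCredentials.get(endUser), endUser);
    }

    public static void main(String[] args) {
        ToyNameNode nameNode = new ToyNameNode();

        // Proposed flow: DT for the end user is fetched on the server and reused on the mapper.
        Map<String, Credential> jobCredentials = initializeOnServer(nameNode, "alice");
        System.out.println("mapper writes as: " + loadOnMapper(nameNode, jobCredentials, "alice"));

        // Current behaviour: the mapper holds only the sqoop2 DT and tries to impersonate.
        try {
            nameNode.authorize(new Credential("sqoop2", CredentialKind.DELEGATION_TOKEN), "alice");
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The point of the sketch is the split: the only call that impersonates happens where the Kerberos ticket exists, and the mapper never impersonates at all because it already holds a token owned by the end user.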



