Jarek Jarcec Cecho created SQOOP-2709:
-----------------------------------------
Summary: Sqoop2: HDFS: Impersonation on secured cluster doesn't
work
Key: SQOOP-2709
URL: https://issues.apache.org/jira/browse/SQOOP-2709
Project: Sqoop
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 1.99.7
Using the HDFS connector on a secured cluster currently fails with the
following exception:
{code}
2015-11-19 13:24:30,624 [OutputFormatLoader-consumer] ERROR org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor - Error while loading data out of MR job.
org.apache.sqoop.common.SqoopException: GENERIC_HDFS_CONNECTOR_0005:Error occurs during loader run
at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:119)
at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:60)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:60)
at org.apache.sqoop.connector.hdfs.HdfsLoader.load(HdfsLoader.java:44)
at org.apache.sqoop.job.mr.SqoopOutputFormatLoadExecutor$ConsumerThread.run(SqoopOutputFormatLoadExecutor.java:267)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "sqoopkrb-4.vpc.cloudera.com/172.28.211.196"; destination host is: "sqoopkrb-1.vpc.cloudera.com":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1476)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy15.create(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:295)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy16.create(Unknown Source)
at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1867)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1737)
at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1662)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:404)
at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:400)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:400)
at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:343)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:917)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:898)
at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:795)
at org.apache.sqoop.connector.hdfs.hdfsWriter.HdfsTextWriter.initialize(HdfsTextWriter.java:40)
at org.apache.sqoop.connector.hdfs.HdfsLoader$1.run(HdfsLoader.java:93)
... 12 more
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:682)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:645)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:733)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:370)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1525)
at org.apache.hadoop.ipc.Client.call(Client.java:1442)
... 36 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:413)
at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:555)
at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:370)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:721)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:720)
... 39 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
... 48 more
{code}
It's a very long exception, but the gist of it is here:
{code}
Host Details : local host is: "sqoopkrb-4.vpc.cloudera.com/172.28.211.196";
destination host is: "sqoopkrb-1.vpc.cloudera.com":8020;
{code}
Together with [~abrahamfine], we've triaged it to the fact that we do the
impersonation exactly the same way on the Sqoop 2 server side and on the mapper
side. However, on the mapper side we no longer have a Kerberos ticket; we only
have a delegation token for the {{sqoop2}} user. The [Hadoop
documentation|https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html]
contains this very relevant snippet:
{quote}
If the cluster is running in Secure Mode, the superuser must have kerberos
credentials to be able to impersonate another user. It cannot use delegation
tokens for this feature.
{quote}
Hence, in order to do impersonation properly on a secured cluster, we will have
to do some dark magic with delegation tokens: retrieve a delegation token (DT)
for the end user inside the HDFS initialization and pass it to the execution
engine.
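
To make the credential rule above concrete, here is a toy Java model of it. This is only a sketch and not Hadoop or Sqoop code: the class {{ImpersonationModel}}, its {{Credential}} enum, the {{Caller}} type and both methods are hypothetical names invented for illustration, and the end user name used in the note below is made up. It shows that a caller holding a Kerberos TGT (the Sqoop 2 server) can obtain a delegation token on behalf of the end user, while a caller holding only a delegation token (the mapper) cannot impersonate at all.

```java
import java.util.EnumSet;
import java.util.Set;

/** Toy model of the impersonation rule; NOT Hadoop code, all names are hypothetical. */
final class ImpersonationModel {

    /** Kinds of credentials a process may hold. */
    enum Credential { KERBEROS_TGT, DELEGATION_TOKEN }

    /** A process acting as some user with a given set of credentials. */
    static final class Caller {
        final String user;
        final Set<Credential> credentials;

        Caller(String user, Set<Credential> credentials) {
            this.user = user;
            this.credentials = credentials;
        }
    }

    /** Rule from the quoted documentation: impersonation needs Kerberos credentials. */
    static boolean canImpersonate(Caller superuser) {
        return superuser.credentials.contains(Credential.KERBEROS_TGT);
    }

    /**
     * Models the proposed approach: while a Kerberos TGT is still available
     * (server side), obtain a delegation token for the end user. The returned
     * Caller represents what would be handed over to the execution engine.
     */
    static Caller fetchEndUserToken(Caller superuser, String endUser) {
        if (!canImpersonate(superuser)) {
            throw new SecurityException(
                superuser.user + " has no Kerberos TGT and cannot impersonate " + endUser);
        }
        return new Caller(endUser, EnumSet.of(Credential.DELEGATION_TOKEN));
    }
}
```

In this model, a {{Caller}} for {{sqoop2}} holding {{KERBEROS_TGT}} can call {{fetchEndUserToken}} for an end user such as {{alice}}, whereas the same call made by a {{sqoop2}} caller holding only {{DELEGATION_TOKEN}} throws, which mirrors the failure seen on the mapper side.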