Eric Yang created HADOOP-16356: ---------------------------------- Summary: Distcp with webhdfs is not working with ProxyUserAuthenticationFilter or AuthenticationFilter Key: HADOOP-16356 URL: https://issues.apache.org/jira/browse/HADOOP-16356 Project: Hadoop Common Issue Type: Sub-task Reporter: Eric Yang
When distcp is running with webhdfs://, there is no delegation token issued to mapreduce task because mapreduce task does not have kerberos tgt ticket. This stack trace was thrown when mapreduce task contacts webhdfs: {code} Error: org.apache.hadoop.security.AccessControlException: Authentication required at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:492) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:136) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:760) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:835) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:663) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:701) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:697) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:1095) at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:1106) at org.apache.hadoop.tools.mapred.CopyMapper.setup(CopyMapper.java:124) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172) {code} There are two proposals: 1. Have a API to issue delegation token to pass along to webhdfs to maintain backward compatibility. 2. Have mapreduce task login to kerberos then perform webhdfs fetching. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org