Eric Yang created HADOOP-16356:
----------------------------------

             Summary: Distcp with webhdfs is not working with ProxyUserAuthenticationFilter or AuthenticationFilter
                 Key: HADOOP-16356
                 URL: https://issues.apache.org/jira/browse/HADOOP-16356
             Project: Hadoop Common
          Issue Type: Sub-task
            Reporter: Eric Yang


When DistCp runs with webhdfs://, no delegation token is issued to the 
MapReduce task, because the MapReduce task does not have a Kerberos TGT.

The following stack trace is thrown when the MapReduce task contacts webhdfs:

{code}
Error: org.apache.hadoop.security.AccessControlException: Authentication required
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.validateResponse(WebHdfsFileSystem.java:492)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.access$200(WebHdfsFileSystem.java:136)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.connect(WebHdfsFileSystem.java:760)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.runWithRetry(WebHdfsFileSystem.java:835)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.access$100(WebHdfsFileSystem.java:663)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner$1.run(WebHdfsFileSystem.java:701)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem$AbstractRunner.run(WebHdfsFileSystem.java:697)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getHdfsFileStatus(WebHdfsFileSystem.java:1095)
        at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.getFileStatus(WebHdfsFileSystem.java:1106)
        at org.apache.hadoop.tools.mapred.CopyMapper.setup(CopyMapper.java:124)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:799)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1891)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
{code}

There are two proposals:

1. Add an API that issues a delegation token to pass along to webhdfs, which 
maintains backward compatibility.
2. Have the MapReduce task log in to Kerberos and then fetch from webhdfs.
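
For reference, a rough sketch of what proposal 1 looks like at the WebHDFS REST level. It is not the proposed fix. It only builds the two request URLs involved: a Kerberos-authenticated client asks for a token with op=GETDELEGATIONTOKEN, and the task later presents that token through the delegation query parameter instead of authenticating with Kerberos. The class name, method names, host, port, path, and token value are all made up for illustration, and the path is assumed to be absolute and already URL-safe.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of Hadoop: it only assembles WebHDFS URLs.
public class WebHdfsTokenUrlSketch {

    // URL-encode a query parameter value as UTF-8.
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // URL a Kerberos-authenticated client (e.g. the job submitter) would
    // call to obtain a delegation token for the given renewer.
    static String getDelegationTokenUrl(String baseUrl, String renewer) {
        return baseUrl + "/webhdfs/v1/?op=GETDELEGATIONTOKEN&renewer=" + enc(renewer);
    }

    // URL a task without a TGT would call, presenting the token it was handed.
    static String fileStatusUrl(String baseUrl, String absolutePath, String token) {
        return baseUrl + "/webhdfs/v1" + absolutePath
                + "?op=GETFILESTATUS&delegation=" + enc(token);
    }

    public static void main(String[] args) {
        String base = "http://nn.example.com:9870";   // assumed NameNode HTTP address
        System.out.println(getDelegationTokenUrl(base, "yarn"));
        System.out.println(fileStatusUrl(base, "/user/alice/data", "EXAMPLE_TOKEN"));
    }
}
```

In this shape, the task side never needs a Kerberos ticket, which is why proposal 1 keeps existing behavior intact; proposal 2 would instead require credentials (such as a keytab) to be available to every task.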




