Daryn Sharp created HDFS-4649:
---------------------------------

             Summary: Webhdfs cannot list large directories
                 Key: HDFS-4649
                 URL: https://issues.apache.org/jira/browse/HDFS-4649
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: namenode, security, webhdfs
    Affects Versions: 2.0.0-alpha, 0.23.0, 3.0.0
            Reporter: Daryn Sharp
            Assignee: Daryn Sharp
            Priority: Blocker

Webhdfs returns malformed JSON for directories whose entry count exceeds the configured {{dfs.ls.limit}} value. The streaming object returned by {{NamenodeWebhdfsMethods#getListingStream}} calls {{getListing}} repeatedly, once for each segment of the directory listing.

{{getListingStream}} runs within the remote user's ugi and acquires the first segment of the directory, then returns a streaming object. The streaming object is later executed _outside of the user's ugi_. Luckily it runs as the host service principal (i.e. {{host/namenode@REALM}}), so the result is permission denied for the "host" user:

{noformat}
org.apache.hadoop.security.AccessControlException: Permission denied: user=host, access=EXECUTE, inode="/path":someuser:group:drwx------
{noformat}

The exception causes the streamer to abort the JSON output prematurely, leaving it malformed. Meanwhile, the client sees only the cryptic:

{noformat}
java.lang.IllegalStateException: unexpected end of array
	at org.mortbay.util.ajax.JSON.parseArray(JSON.java:902)
	[...]
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.jsonParse(WebHdfsFileSystem.java:242)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.run(WebHdfsFileSystem.java:441)
	at org.apache.hadoop.hdfs.web.WebHdfsFileSystem.listStatus(WebHdfsFileSystem.java:717)
	[...]
{noformat}
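
The failure mode described above can be illustrated with a small, self-contained model. This is only a sketch and not Hadoop code: the {{ThreadLocal}} named {{CURRENT_USER}} is a hypothetical stand-in for the ugi, and {{doAs}}, {{getListing}}, {{buggyStream}} and {{scopedStream}} are illustrative names invented for this example. {{scopedStream}} shows one possible remedy (capturing the caller and re-entering its scope when the stream runs); it is not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Toy model of deferred work escaping a caller's security scope. Not Hadoop code. */
class UgiScopeSketch {
    // Hypothetical stand-in for the ugi: the identity the current thread acts as.
    // Defaults to "host", mirroring the service principal mentioned in the report.
    private static final ThreadLocal<String> CURRENT_USER =
        ThreadLocal.withInitial(() -> "host");

    /** Runs the action as the given user, then restores the previous identity. */
    static <T> T doAs(String user, Supplier<T> action) {
        String previous = CURRENT_USER.get();
        CURRENT_USER.set(user);
        try {
            return action.get();
        } finally {
            CURRENT_USER.set(previous);
        }
    }

    /** Hypothetical listing call: only the directory owner may list a segment. */
    static List<String> getListing(String owner, int segment) {
        String caller = CURRENT_USER.get();
        if (!caller.equals(owner)) {
            throw new SecurityException("Permission denied: user=" + caller);
        }
        return Arrays.asList("file-" + segment + "a", "file-" + segment + "b");
    }

    /** Buggy shape: the first segment is fetched eagerly, the rest are deferred. */
    static Supplier<List<String>> buggyStream(String owner, int segments) {
        List<String> first = getListing(owner, 0); // still inside the caller's scope
        return () -> {
            List<String> all = new ArrayList<>(first);
            for (int i = 1; i < segments; i++) {
                // Runs as whoever invokes the supplier, not as the original caller.
                all.addAll(getListing(owner, i));
            }
            return all;
        };
    }

    /** One possible remedy: remember the caller and re-enter its scope later. */
    static Supplier<List<String>> scopedStream(String owner, int segments) {
        String user = CURRENT_USER.get();
        Supplier<List<String>> inner = buggyStream(owner, segments);
        return () -> doAs(user, inner);
    }

    public static void main(String[] args) {
        Supplier<List<String>> buggy =
            doAs("someuser", () -> buggyStream("someuser", 3));
        try {
            buggy.get(); // executed outside doAs, so the later segments are denied
        } catch (SecurityException e) {
            System.out.println("buggy: " + e.getMessage());
        }

        Supplier<List<String>> scoped =
            doAs("someuser", () -> scopedStream("someuser", 3));
        System.out.println("scoped: " + scoped.get().size() + " entries");
    }
}
```

In this model a listing that fits in a single segment succeeds even with {{buggyStream}}, because nothing is deferred. That matches the report's observation that only directories exceeding {{dfs.ls.limit}} are affected.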