Broken symlinks break the local filesystem implementation of HDFS on Linux
--------------------------------------------------------------------------

                 Key: HDFS-1412
                 URL: https://issues.apache.org/jira/browse/HDFS-1412
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 0.20.2
         Environment: Ubuntu 9.10 Lucid
            Reporter: Ed Kohlwey

When calling listStatus() on a directory containing broken symlinks, a FileNotFoundException is thrown. The problem is that even though File.list() returns an entry for the symlink, calling File.exists() on a file with that path returns false, so the subsequent getFileStatus() call fails.

I would suggest either checking that files exist in RawLocalFileSystem.listStatus() before calling getFileStatus(), and if a file doesn't exist not putting it in the result (less work), or modifying the RawLocalFileSystem implementation to treat broken symlinks as files of length 0 (more work), or adding an exists() method to FileStatus (perhaps even more work, and it involves changing the API).

Here's the relevant section of the stack trace.

java.io.FileNotFoundException: File XXX does not exist.
	at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
	at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:290)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:721)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:746)
	at org.apache.hadoop.fs.ChecksumFileSystem.listStatus(ChecksumFileSystem.java:465)
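
To illustrate the mismatch between File.list() and File.exists(), and the first suggested fix (skip entries that do not exist), here is a minimal sketch using only the JDK. It is not the Hadoop code: the class name BrokenSymlinkListing and the method listExisting() are hypothetical, and the sketch assumes a platform such as Linux where an unprivileged process can create symbolic links.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class BrokenSymlinkListing {

    /**
     * Hypothetical helper: lists the entries of dir, skipping any name
     * whose target does not exist (for example a dangling symlink).
     */
    public static List<File> listExisting(File dir) {
        List<File> result = new ArrayList<>();
        String[] names = dir.list();      // includes names of broken symlinks
        if (names == null) {
            return result;                // not a directory, or an I/O error
        }
        for (String name : names) {
            File entry = new File(dir, name);
            if (entry.exists()) {         // follows the link; false if the target is missing
                result.add(entry);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("symlink-demo");
        Files.createFile(dir.resolve("regular.txt"));
        // Dangling link: the target "missing" is never created.
        Files.createSymbolicLink(dir.resolve("dangling"), dir.resolve("missing"));

        System.out.println("File.list() entries: " + dir.toFile().list().length);
        System.out.println("Existing entries:    " + listExisting(dir.toFile()).size());
    }
}
```

On Linux the first count should include the dangling link while the second should not. Filtering this way silently hides broken links from callers, which is the trade-off of the "less work" option compared with reporting them as zero-length files.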