Chandan Biswas created HDFS-8714:
------------------------------------

             Summary: Folder ModificationTime in Millis Changed When NameNode is restarted
                 Key: HDFS-8714
                 URL: https://issues.apache.org/jira/browse/HDFS-8714
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Chandan Biswas


*Steps to Reproduce*
# Steps to perform in the program
** Create a folder in HDFS
** Print the folder's modificationTime in millis
** Upload or copy a file into the newly created folder
** Print the file's and the folder's modificationTime in millis
** Restart the NameNode
** Print the file's and the folder's modificationTime in millis again
# Expected result
** Before the NameNode restart, the folder modification time should equal the file modification time
** The folder modification time should not change after the NameNode restart
# Actual result
** The folder modification time is not the same as the file modification time
** The folder modification time changes after the NameNode restart: it becomes the file modification time

*Impact of this behavior:* Before a task is launched, distributed cache 
files/folders are checked for modification. The check compares the 
file/folder modificationTime in millis. So any job that uses the 
distributed cache can fail if 
# the NameNode restarts and running tasks are resubmitted, or 
# the NameNode restarts while some tasks are still queued, e.g. 50 of 100 tasks are waiting to run.
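
The kind of check described above can be sketched as follows. This is only an illustration, not Hadoop's actual distributed cache code: the class {{TimestampCheckSketch}} and the methods {{isUnchanged}} and {{verifyUnchanged}} are hypothetical names, and the two timestamps are the folder values from the output reported in this issue (before and after the NameNode restart).

```java
import java.io.IOException;

public class TimestampCheckSketch {

    // Hypothetical helper: true when the timestamp recorded at job submission
    // still matches the timestamp the file system reports now.
    static boolean isUnchanged(long recordedMillis, long currentMillis) {
        return recordedMillis == currentMillis;
    }

    // Hypothetical check run before a task starts: fail if the cached
    // resource appears to have been modified since it was registered.
    static void verifyUnchanged(String path, long recordedMillis, long currentMillis)
            throws IOException {
        if (!isUnchanged(recordedMillis, currentMillis)) {
            throw new IOException("Resource " + path + " changed: expected "
                    + recordedMillis + ", was " + currentMillis);
        }
    }

    public static void main(String[] args) {
        String folder = "/user/vagrant/chandan/test/";
        // Folder modification times taken from the output in this report.
        long recordedBeforeRestart = 1435868217353L;
        long reportedAfterRestart = 1435868217368L;
        try {
            verifyUnchanged(folder, recordedBeforeRestart, reportedAfterRestart);
            System.out.println("unchanged");
        } catch (IOException e) {
            System.out.println("check failed: " + e.getMessage());
        }
    }
}
```

With an exact-equality comparison like this, a folder whose content was never touched would still be rejected once its reported timestamp shifts after the restart.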

Here is the sample code I used for testing:
{code}
        // Create the folder in HDFS and print its modification time.
        final Path pathToFiles = new Path("/user/vagrant/chandan/test/");
        fileSystem.mkdirs(pathToFiles);
        System.out.println("HDFS Folder Modification Time in long Before file copy:"
                + fileSystem.getFileStatus(pathToFiles).getModificationTime());

        // Copy a file into the new folder, then print both modification times.
        FileUtil.copy(fileSystem, new Path("/user/cloudera/test"), fileSystem,
                pathToFiles, false, configuration);
        System.out.println("HDFS File Modification Time in long:"
                + fileSystem.getFileStatus(
                        new Path("/user/vagrant/chandan/test/test")).getModificationTime());
        System.out.println("HDFS Folder Modification Time in long After file copy:"
                + fileSystem.getFileStatus(pathToFiles).getModificationTime());

        // Print both times every two minutes; restart the NameNode while this runs.
        for (int i = 0; i < 100; i++) {
            System.out.println("Normal HDFS Folder Modification Time in long:"
                    + fileSystem.getFileStatus(pathToFiles).getModificationTime());
            System.out.println("Normal HDFS File Modification Time in long:"
                    + fileSystem.getFileStatus(
                            new Path("/user/vagrant/chandan/test/test")).getModificationTime());
            Thread.sleep(60000 * 2);
        }
{code}
Here is the output:
{code}
HDFS Folder Modification Time in long Before file copy:1435868217309
HDFS File Modification Time in long:1435868217368
HDFS Folder Modification Time in long After file copy:1435868217353
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217353
Normal HDFS File Modification Time in long:1435868217368
Normal HDFS Folder Modification Time in long:1435868217368
Normal HDFS File Modification Time in long:1435868217368
{code}
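The last two lines were printed after the NameNode restart: the folder modification time changed from 1435868217353 to 1435868217368, which is the file modification time.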



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
