fs -put crash that depends on source file name
----------------------------------------------

                 Key: HDFS-1768
                 URL: https://issues.apache.org/jira/browse/HDFS-1768
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client, name-node
    Affects Versions: 0.20.2
         Environment: Cloudera CDH3B4 in pseudo-distributed mode on a Linux 
2.6.32-28-generic #55-Ubuntu SMP x86_64 kernel, with Java HotSpot 64-Bit Server 
VM (build 19.1-b02, mixed mode)
            Reporter: Lars Ailo Bongo
            Priority: Minor


I have a unit test that writes a file to HDFS using copyFromLocalFile. 
Sometimes the call fails with a checksum error. Once this has happened, 
"hadoop fs -put <filename> <anywhere>" also fails for as long as the source 
filename is the same as the one used in the unit test. The file content is 
never sent to the DataNode, so the file created in HDFS has size zero. 

The error is not due to the file content. The error does not depend on the HDFS 
destination name. Restarting the NameNode and DataNode does not resolve the 
issue. I have not been able to reproduce the error with a simple program. I 
have also not tested the issue in distributed or standalone mode.

The only "fix" is to change the source filename.
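
One thing that may be worth checking, although nothing in this report confirms 
it as the cause: Hadoop's local file system is checksummed and keeps a hidden 
sidecar file named .<filename>.crc next to each file written through it. A 
leftover sidecar that no longer matches the file would be tied to the source 
filename, which would fit the symptoms described above. The sketch below only 
looks for such a sidecar; the function names are hypothetical and it does not 
use any Hadoop API.

```python
from pathlib import Path


def checksum_sidecar(source):
    """Path where a '.<name>.crc' sidecar for `source` would live.

    Assumption: the local checksummed file system names the sidecar
    '.<filename>.crc' and places it in the same directory as the file.
    """
    p = Path(source)
    return p.with_name("." + p.name + ".crc")


def stale_sidecar(source):
    """Return the sidecar path if it exists on disk, otherwise None."""
    sidecar = checksum_sidecar(source)
    return sidecar if sidecar.exists() else None


if __name__ == "__main__":
    import sys

    # Usage: python check_crc.py status-test.txt [more files...]
    for name in sys.argv[1:]:
        found = stale_sidecar(name)
        print(name, "->", found if found else "no sidecar found")
```

If a sidecar is reported for status-test.txt, moving it aside and retrying the 
put would show whether it is involved.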

Below are the error output and the NameNode log. There is no entry for this 
operation in the DataNode log.

/home/larsab/troilkatt2/test-tmp/data>hadoop fs -put status-test.txt status-test.txt3
11/03/18 16:59:54 INFO fs.FSInputChecker: Found checksum error: b[512, 968]=3a646f6e650a323a7365636f6e6453746167653a73746172740a323a7365636f6e6453746167653a646f6e650a323a746869726453746167653a73746172740a323a746869726453746167653a646f6e650a323a74686553696e6b3a73746172740a323a74686553696e6b3a646f6e650a323a54726f696c6b6174743a646f6e650a333a54726f696c6b6174743a73746172740a333a746865536f757263653a73746172740a333a746865536f757263653a646f6e650a333a666972737453746167653a73746172740a333a666972737453746167653a646f6e650a333a7365636f6e6453746167653a73746172740a333a7365636f6e6453746167653a646f6e650a333a746869726453746167653a73746172740a333a746869726453746167653a646f6e650a333a74686553696e6b3a73746172740a333a74686553696e6b3a646f6e650a333a54726f696c6b6174743a646f6e650a343a54726f696c6b6174743a73746172740a343a746865536f757263653a73746172740a343a746865536f757263653a646f6e650a343a666972737453746167653a73746172740a343a666972737453746167653a646f6e650a343a7365636f6e6453746167653a7265636f7665720a
org.apache.hadoop.fs.ChecksumException: Checksum error: status-test.txt at 512
        at org.apache.hadoop.fs.FSInputChecker.verifySum(FSInputChecker.java:277)
        at org.apache.hadoop.fs.FSInputChecker.readChecksumChunk(FSInputChecker.java:241)
        at org.apache.hadoop.fs.FSInputChecker.read1(FSInputChecker.java:189)
        at org.apache.hadoop.fs.FSInputChecker.read(FSInputChecker.java:158)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:49)
        at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:87)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:224)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:170)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:1283)
        at org.apache.hadoop.fs.FsShell.copyFromLocal(FsShell.java:134)
        at org.apache.hadoop.fs.FsShell.run(FsShell.java:1817)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
        at org.apache.hadoop.fs.FsShell.main(FsShell.java:1960)
put: Checksum error: status-test.txt at 512
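
The hex string in the "Found checksum error" line is the content of the chunk 
that failed verification. As a reading aid, it can be decoded back to text. The 
following is a minimal sketch; the function name is hypothetical and the sample 
is only the first few bytes of the dump above.

```python
def decode_checksum_dump(hex_text):
    """Decode the hex payload printed after 'b[start, end]=' into text."""
    # Drop any whitespace introduced by line wrapping before decoding.
    compact = "".join(hex_text.split())
    return bytes.fromhex(compact).decode("ascii", errors="replace")


if __name__ == "__main__":
    # First bytes of the dump in the error message above.
    sample = "3a646f6e650a323a7365636f6e6453746167653a73746172740a"
    print(decode_checksum_dump(sample))
```

Decoding the sample yields ":done" followed by "2:secondStage:start", so the 
chunk holds ordinary status text from the source file.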

NAMENODE
2011-03-18 16:59:54,422 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Number of transactions: 13 Total time for transactions(ms): 1Number of transactions batched in Syncs: 0 Number of syncs: 7 SyncTimes(ms): 220 
2011-03-18 16:59:54,444 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=larsab      ip=/127.0.0.1   cmd=create      src=/user/larsab/status-test.txt3       dst=null        perm=larsab:supergroup:rw-r--r--
2011-03-18 16:59:54,469 INFO org.apache.hadoop.hdfs.StateChange: Removing lease on  file /user/larsab/status-test.txt3 from client DFSClient_-1004170418
2011-03-18 16:59:54,469 INFO org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.completeFile: file /user/larsab/status-test.txt3 is closed by DFSClient_-1004170418
