Hi All,


In one of our Hadoop clusters we faced checksum file corruption, due to which
appending to the file failed.

If any of you have faced this problem earlier, please share your experiences.

We are using Hadoop 0.20.1 with the append feature.



Scenario:
===============
1. Created the file, wrote 305 bytes, and closed the stream.



2. Called append on the same file, wrote 307 bytes, and closed the stream.



3. Repeated Step 2 with different byte counts (311, 313, 307, 305, 313, 311, 307,
311, 313, 307, 307, 307, 305, 307, 305, 290, 288, 305, 307, 307, 307, 290).



4. Repeated Step 2 once more with 294 bytes. The pipeline was
{xxx.xxx.xxx.106:50010, xxx.xxx.xxx.xxx:10010}.
Now the file length is 7629 bytes, and the stream is closed.
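As a quick sanity check (not part of the original report), the bytes written
across Steps 1-4 do sum to the reported block length:

```python
# All appends from the scenario above, in order.
appends = [305, 307,                                   # Steps 1 and 2
           311, 313, 307, 305, 313, 311, 307, 311,
           313, 307, 307, 307, 305, 307, 305, 290,
           288, 305, 307, 307, 307, 290,               # Step 3
           294]                                        # Step 4

total = sum(appends)
print(total)  # 7629 -- matches the reported file length
```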



Here the checksum is verified by the last DataNode in the pipeline for every
packet received. If verification fails, an exception is thrown.

Since there is no exception in any of the DataNode logs, checksum verification
should have succeeded, and the meta file size should be 67 bytes.

The meta file should contain a 7-byte header and 15 checksums of 4 bytes each:
7629/512 = 14 checksums for full chunks plus 1 checksum for the partial chunk,
so 7 + 15*4 = 67 bytes.
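Assuming the default io.bytes.per.checksum of 512 and 4-byte CRC32 checksums,
the expected meta file size can be sketched like this (the constants mirror the
figures above; this is an illustration, not the FSDataset code itself):

```python
import math

BYTES_PER_CHECKSUM = 512   # default io.bytes.per.checksum
CHECKSUM_SIZE = 4          # CRC32 checksum width in bytes
META_HEADER_SIZE = 7       # meta file header, per the 7-byte figure above

def expected_meta_size(block_len):
    # One checksum per full 512-byte chunk, plus one for any trailing
    # partial chunk -- i.e. ceil(block_len / 512) checksums in total.
    num_checksums = math.ceil(block_len / BYTES_PER_CHECKSUM)
    return META_HEADER_SIZE + num_checksums * CHECKSUM_SIZE

print(expected_meta_size(7629))  # 7 + 15*4 = 67
```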



5. Now append to the same file is called again. This append fails because block
recovery fails at the DataNodes with the exception below.



java.io.IOException: Block blk_1329468764084_188363 is of size 7629 but has 17 checksums and each checksum size is 4 bytes.
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.validateBlockMetadata(FSDataset.java:1922)
	at org.apache.hadoop.hdfs.server.datanode.FSDataset.startBlockRecovery(FSDataset.java:2142)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startBlockRecovery(DataNode.java:2078)
	at sun.reflect.GeneratedMethodAccessor29.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:547)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1139)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1135)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1133)




Here the meta file length is 17*4 + 7 = 75 bytes, but it should be 67 bytes
according to the analysis in Step 4.
The data block sizes in Step 4 and Step 5 match, but the meta file sizes do
not.
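The consistency check that throws the exception can be paraphrased as follows
(a sketch of the size-versus-checksum-count validation, not the actual
FSDataset.validateBlockMetadata source): 17 checksums can only describe a block
of 16*512+1 = 8193 to 17*512 = 8704 bytes, so a 7629-byte block fails.

```python
BYTES_PER_CHECKSUM = 512   # default io.bytes.per.checksum
CHECKSUM_SIZE = 4          # CRC32 checksum width in bytes
META_HEADER_SIZE = 7       # meta file header size in bytes

def block_matches_meta(block_len, meta_len):
    # Number of checksums actually stored in the meta file.
    stored = (meta_len - META_HEADER_SIZE) // CHECKSUM_SIZE
    # Range of block lengths that 'stored' checksums could describe:
    # the last checksum may cover a partial chunk of 1..512 bytes.
    min_len = (stored - 1) * BYTES_PER_CHECKSUM + 1
    max_len = stored * BYTES_PER_CHECKSUM
    return min_len <= block_len <= max_len

print(block_matches_meta(7629, 67))  # True:  15 checksums cover 7169..7680
print(block_matches_meta(7629, 75))  # False: 17 checksums need 8193..8704
```

The second call corresponds to the corrupted state in the exception: the block
is 7629 bytes, but the 75-byte meta file holds 17 checksums.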





Thanks and Regards,
Vinayakumar B
