[ 
https://issues.apache.org/jira/browse/HBASE-11625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275084#comment-15275084
 ] 

Appy commented on HBASE-11625:
------------------------------

Uploading the patch.
Testing:
The bug being fixed here happens only when the data is actually corrupted.
We already have tests that 'simulate' a checksum failure, i.e. when a checksum 
request comes in, 
[this|https://github.com/apache/hbase/blob/513ca3483f1d32450ffa0c034e7a7f97b63ff582/hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestChecksum.java#L347]
 simply returns false. But that is not sufficient; consider this example.
Say the correct logical order for some steps is A --> B, but the code actually 
has B --> A. Since we 'simulate' the failure exactly at point A, the test 
doesn't care about the position of B relative to A. If the data were instead 
corrupted for real, we would have seen an unexpected crash at B in the buggy 
case (while expecting a crash at A) and caught the bug earlier.
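The difference between a stubbed checksum failure and real corruption can be sketched in a few lines (a minimal standalone Java illustration; {{CorruptionSketch}}, {{magicOk}}, and the file layout here are hypothetical, not HBase internals):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not HBase code: corrupting a real byte on disk makes
// every step of the read path (magic parse, header decode, checksum) see the
// corruption in its true order, instead of stubbing the failure at one point.
public class CorruptionSketch {
    static final String MAGIC = "DATABLK*";

    // True if the first 8 bytes match the block magic.
    static boolean magicOk(byte[] block) {
        if (block.length < MAGIC.length()) return false;
        return MAGIC.equals(
            new String(block, 0, MAGIC.length(), StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("block", ".bin");
        Files.write(f, (MAGIC + "payload").getBytes(StandardCharsets.US_ASCII));

        // Real corruption: flip one byte inside the on-disk magic.
        byte[] onDisk = Files.readAllBytes(f);
        onDisk[1] ^= (byte) 0xFF;
        Files.write(f, onDisk);

        // A reader that parses the magic before validating checksums now
        // fails at the magic check -- so a wrong ordering of those steps
        // would surface here, which a stubbed checksum failure never shows.
        System.out.println(magicOk(Files.readAllBytes(f))
            ? "magic ok" : "invalid block magic");
    }
}
```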
The test change does exactly that. The first output below is from running the 
test on current master; the second output is with the patch.
{noformat}
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hbase.io.hfile.TestChecksum
Tests run: 4, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 1.538 sec <<< 
FAILURE! - in org.apache.hadoop.hbase.io.hfile.TestChecksum
testChecksumCorruption(org.apache.hadoop.hbase.io.hfile.TestChecksum)  Time 
elapsed: 0.048 sec  <<< ERROR!
java.io.IOException: Invalid HFile block magic: D\x00TABLK*
        at org.apache.hadoop.hbase.io.hfile.BlockType.parse(BlockType.java:159)
        at org.apache.hadoop.hbase.io.hfile.BlockType.read(BlockType.java:172)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock.<init>(HFileBlock.java:337)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1695)
        at 
org.apache.hadoop.hbase.io.hfile.TestChecksum$CorruptedFSReaderImpl.readBlockDataInternal(TestChecksum.java:372)
        at 
org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1527)
        at 
org.apache.hadoop.hbase.io.hfile.TestChecksum.testChecksumCorruptionInternals(TestChecksum.java:197)
        at 
org.apache.hadoop.hbase.io.hfile.TestChecksum.testChecksumCorruption(TestChecksum.java:152)


Results :

Tests in error:
  TestChecksum.testChecksumCorruption:152->testChecksumCorruptionInternals:197 
» IO
{noformat}

{noformat}
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.apache.hadoop.hbase.io.hfile.TestChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.862 sec - in 
org.apache.hadoop.hbase.io.hfile.TestChecksum

Results :

Tests run: 4, Failures: 0, Errors: 0, Skipped: 0
{noformat}

Note:
On a local filesystem (specifying the path as file:///....), we don't enable 
hbase checksums, so a simple {{hbase hfile -p -f ...}} doesn't work. That 
actually makes the last repro method meaningless. Ref 
[1|https://github.com/apache/hbase/blob/8ace5bbfcea01e02c5661f75fe9458e04fa3b60f/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/HFileSystem.java#L117]
 and 
[2|https://github.com/apache/hbase/blob/8ace5bbfcea01e02c5661f75fe9458e04fa3b60f/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java#L542]
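The note above can be illustrated with a toy sketch (assumed behavior only: Hadoop's local filesystem already verifies its own CRCs, so HBase-level checksums would be redundant there; {{Fs}}, {{LocalFs}}, {{HdfsFs}}, and {{useHBaseChecksum}} are hypothetical names, not the actual HFileSystem API):

```java
// Hypothetical simplification, not the linked HFileSystem code: HBase-level
// block checksums are only useful when the wrapped filesystem does not
// already checksum the data itself. On file:/// the local filesystem does,
// which is why the hbase-checksum read path is never exercised there.
interface Fs { boolean verifiesOwnChecksums(); }

final class LocalFs implements Fs {
    public boolean verifiesOwnChecksums() { return true; }
}

final class HdfsFs implements Fs {
    public boolean verifiesOwnChecksums() { return false; }
}

public class ChecksumChoice {
    // Enable HBase checksums only when the underlying fs does not
    // already verify the data.
    static boolean useHBaseChecksum(Fs fs) {
        return !fs.verifiesOwnChecksums();
    }

    public static void main(String[] args) {
        System.out.println("local: " + useHBaseChecksum(new LocalFs()));
        System.out.println("hdfs:  " + useHBaseChecksum(new HdfsFs()));
    }
}
```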



> Reading datablock throws "Invalid HFile block magic" and can not switch to 
> hdfs checksum 
> -----------------------------------------------------------------------------------------
>
>                 Key: HBASE-11625
>                 URL: https://issues.apache.org/jira/browse/HBASE-11625
>             Project: HBase
>          Issue Type: Bug
>          Components: HFile
>    Affects Versions: 0.94.21, 0.98.4, 0.98.5, 1.0.1.1, 1.0.3
>            Reporter: qian wang
>            Assignee: Pankaj Kumar
>             Fix For: 2.0.0
>
>         Attachments: 2711de1fdf73419d9f8afc6a8b86ce64.gz, HBASE-11625.patch, 
> correct-hfile, corrupted-header-hfile
>
>
> When using hbase checksum, readBlockDataInternal() in HFileBlock.java is 
> called; file corruption can occur there, but the reader can only switch to 
> the hdfs checksum input stream at validateBlockChecksum(). If the data 
> block's header is corrupted when b = new HFileBlock() runs, it throws the 
> exception "Invalid HFile block magic" and the rpc call fails



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
