[ https://issues.apache.org/jira/browse/HDFS-13056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16412449#comment-16412449 ]
Dennis Huo commented on HDFS-13056:
-----------------------------------

Thanks for taking a look [~ste...@apache.org]! Applied your suggestions in [^HDFS-13056.010.patch]:
- Mark getFileChecksumWithCombineMode as LimitedPrivate
- Add TestCopyMapperCompositeCrc extending TestCopyMapper, differentiating the behaviors of the checksum options in terms of which file layouts they support
- Remove String.format from some LOG.debug statements
- Make ReplicatedFileChecksumComputer raise PathIOExceptions
- Switch TestCrcUtil and TestCrcComposer to use LambdaTestUtils.intercept instead of JUnit's ExpectedException

There are a few places that will require followup work in LambdaTestUtils before switching over, namely:
1. Supporting checks on additional messages in the causal chain and/or suppressed exceptions
2. Making it easy to check for multiple different string fragments in the exception text

I have some rudimentary parts of that in this followup: https://issues.apache.org/jira/browse/HDFS-13256

Re: [~xiaochen] - Removing or marking deprecated sounds good to me; I'll do that part in the followup Jira that also tracks adding WebHDFS support. Filed https://issues.apache.org/jira/browse/HDFS-13345 to track.

Re: distcp, I agree there are some significant shortcomings in the existing behavior; when clusters have different configs it's worse than always trying to overwrite: right now the copy does all the work and then fails on commit, because the checksum check runs after the copy has completed, so the work is wasted. We could file some followup work in DistCp to improve this. As a user, I initially expected distcp's natural behavior to take into consideration whether FileChecksum#getAlgorithmName returns the same value on both sides before deciding whether it's okay to compare the checksums at all. I'd have expected mismatched algorithm names to be ignored the same way null FileChecksums are, so that syncing falls back to just file sizes; instead, if the algorithm names differ, distcp tries the copy and then fails on commit. I guess we can discuss how to improve distcp semantics in a followup Jira.
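For reference, a rough sketch of the test style after the switch to LambdaTestUtils.intercept, and of where the followup items above come in. This is illustrative only: parseCrcBytes is a hypothetical stand-in for a CrcUtil-style method that rejects short input, not actual patch code.

{code:java}
import java.io.IOException;
import org.apache.hadoop.test.GenericTestUtils;
import org.apache.hadoop.test.LambdaTestUtils;
import org.junit.Test;

public class TestInterceptStyleSketch {
  // Hypothetical stand-in for a CrcUtil-style method that rejects short input.
  private static int parseCrcBytes(byte[] bytes) throws IOException {
    if (bytes.length < 4) {
      throw new IOException("invalid length " + bytes.length + " for CRC bytes");
    }
    return ((bytes[0] & 0xff) << 24) | ((bytes[1] & 0xff) << 16)
        | ((bytes[2] & 0xff) << 8) | (bytes[3] & 0xff);
  }

  @Test
  public void testRejectsShortInput() throws Exception {
    // intercept() asserts the exception type and one message fragment in one call...
    IOException ex = LambdaTestUtils.intercept(IOException.class,
        "invalid length", () -> parseCrcBytes(new byte[3]));
    // ...but checking a second fragment, the causal chain, or suppressed
    // exceptions still needs manual follow-up asserts today - hence the two
    // followup items listed above.
    GenericTestUtils.assertExceptionContains("CRC bytes", ex);
  }
}
{code}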
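And a hedged sketch of the distcp comparison semantics I had in mind for the last paragraph. This is not actual DistCp code; the class and method names are illustrative assumptions, and the only real APIs used are FileSystem#getFileChecksum and FileChecksum#getAlgorithmName/#equals.

{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative helper, not DistCp's implementation: decide whether two
// FileChecksums are even comparable before treating a mismatch as a failure.
class ChecksumCompareSketch {
  static boolean comparable(FileChecksum src, FileChecksum dst) {
    // Null checksums are already ignored by distcp today; mismatched algorithm
    // names (e.g. MD5-of-MD5-of-CRC32C with different chunk/block settings vs
    // COMPOSITE-CRC32C) could be ignored the same way, falling back to sizes.
    return src != null && dst != null
        && src.getAlgorithmName().equals(dst.getAlgorithmName());
  }

  static void verify(FileSystem srcFs, Path srcPath,
                     FileSystem dstFs, Path dstPath) throws IOException {
    FileChecksum srcSum = srcFs.getFileChecksum(srcPath);
    FileChecksum dstSum = dstFs.getFileChecksum(dstPath);
    if (comparable(srcSum, dstSum) && !srcSum.equals(dstSum)) {
      // Only fail when the two sides produced genuinely comparable checksums.
      throw new IOException("Checksum mismatch: " + srcPath + " vs " + dstPath);
    }
  }
}
{code}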
> Expose file-level composite CRCs in HDFS which are comparable across different instances/layouts
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-13056
>                 URL: https://issues.apache.org/jira/browse/HDFS-13056
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode, distcp, erasure-coding, federation, hdfs
>    Affects Versions: 3.0.0
>            Reporter: Dennis Huo
>            Assignee: Dennis Huo
>            Priority: Major
>         Attachments: HDFS-13056-branch-2.8.001.patch, HDFS-13056-branch-2.8.002.patch, HDFS-13056-branch-2.8.003.patch, HDFS-13056-branch-2.8.004.patch, HDFS-13056-branch-2.8.005.patch, HDFS-13056-branch-2.8.poc1.patch, HDFS-13056.001.patch, HDFS-13056.002.patch, HDFS-13056.003.patch, HDFS-13056.003.patch, HDFS-13056.004.patch, HDFS-13056.005.patch, HDFS-13056.006.patch, HDFS-13056.007.patch, HDFS-13056.008.patch, HDFS-13056.009.patch, HDFS-13056.010.patch, Reference_only_zhen_PPOC_hadoop2.6.X.diff, hdfs-file-composite-crc32-v1.pdf, hdfs-file-composite-crc32-v2.pdf, hdfs-file-composite-crc32-v3.pdf
>
> FileChecksum was first introduced in [https://issues-test.apache.org/jira/browse/HADOOP-3981] and has ever since remained defined as MD5-of-MD5-of-CRC: per-512-byte chunk CRCs are already stored as part of datanode metadata, and the MD5 approach is used to compute an aggregate value in a distributed manner, with individual datanodes computing the MD5-of-CRCs per block in parallel and the HDFS client computing the second-level MD5.
>
> An often-cited shortcoming of this approach is that the FileChecksum is sensitive to the internal block-size and chunk-size configuration, so HDFS files with different block/chunk settings cannot be compared. More commonly, one might have different HDFS clusters that use different block sizes, in which case a data migration cannot use the FileChecksum for distcp's rsync functionality or for verifying end-to-end data integrity (on top of the low-level integrity checks applied at data transfer time).
>
> This was also revisited in https://issues.apache.org/jira/browse/HDFS-8430 during the addition of checksum support for striped erasure-coded files; while there was some discussion of using CRC composability, it ultimately settled on the hierarchical MD5 approach, which also means checksums of basic replicated files are not comparable to those of striped files.
>
> This feature proposes to add a "COMPOSITE-CRC" FileChecksum type which uses CRC composition to remain completely chunk/block agnostic, allowing comparison between striped and replicated files, between different HDFS instances, and possibly even between HDFS and other external storage systems. The feature can be added in place while remaining compatible with existing block metadata, and it doesn't need to change the normal path of chunk verification, so it is minimally invasive. This also means even large preexisting HDFS deployments could adopt the feature to retroactively sync data. A detailed design document can be found here: https://storage.googleapis.com/dennishuo/hdfs-file-composite-crc32-v1.pdf
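To make the quoted description concrete, a rough sketch of the cross-cluster comparison the COMPOSITE-CRC mode is meant to enable. This is illustrative only: the cluster URIs and paths are made up, and it assumes the client-side dfs.checksum.combine.mode=COMPOSITE_CRC setting from this patch; it is not a definitive usage recipe.

{code:java}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileChecksum;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CrossClusterChecksumCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed knob from this patch: ask the client to compose chunk CRCs
    // instead of computing MD5-of-MD5-of-CRC.
    conf.set("dfs.checksum.combine.mode", "COMPOSITE_CRC");

    // Hypothetical cluster URIs and path, for illustration only.
    FileSystem src = FileSystem.get(URI.create("hdfs://clusterA"), conf);
    FileSystem dst = FileSystem.get(URI.create("hdfs://clusterB"), conf);

    FileChecksum a = src.getFileChecksum(new Path("/data/part-00000"));
    FileChecksum b = dst.getFileChecksum(new Path("/data/part-00000"));

    // With composite CRCs the reported algorithm name no longer encodes
    // chunk/block settings, so comparing the two checksums directly is
    // meaningful even across different block sizes or EC-vs-replicated layouts.
    System.out.println("algorithms: " + a.getAlgorithmName()
        + " / " + b.getAlgorithmName());
    System.out.println("match: " + a.equals(b));
  }
}
{code}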