[ https://issues.apache.org/jira/browse/HDDS-11077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Rose resolved HDDS-11077.
-------------------------------
    Resolution: Done

> Optimize checksum calculations in container merkle tree
> -------------------------------------------------------
>
>                 Key: HDDS-11077
>                 URL: https://issues.apache.org/jira/browse/HDDS-11077
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Ethan Rose
>            Assignee: Ethan Rose
>            Priority: Major
>
> *Choosing an Implementation*
> There are two main places we can get our checksum implementations from:
> * {{java.util.zip.CRC32[C\]}} which use native code.
> * {{PureJavaCrc32[C\]}} which has implementations in Ozone, Hadoop, and
> Apache Commons that are all more or less copied from each other.
> The considerations in choosing an implementation are:
> * CRC32C is a general improvement over CRC32.
> * {{java.util.zip.CRC32C}} does not exist until Java 9. Java 8 only has
> {{CRC32}}.
> * {{java.util.zip.Checksum#update(ByteBuffer)}} does not exist until Java 9.
> This is why Ozone has the {{ChecksumByteBuffer}} wrapper class.
> Previous work to determine which checksum to use on data in Ozone was done
> [here|https://github.com/apache/ozone/pull/1910#issuecomment-775165462] and
> [here|https://github.com/apache/ozone/pull/1950]. These links explain the
> decision to default to {{java.util.zip.CRC32}} in Ozone. They also implement
> the ability to swap between {{PureJavaCrc32C}} and {{java.util.zip.CRC32C}}
> based on the Java version when CRC32C is specified.
> *Choosing an update method*
> It looks like {{java.util.zip.Checksum#update(int)}} only reads the low-order
> byte (the low eight bits) of the int. This is based on the [Java 9 javadoc for
> CRC32C|https://docs.oracle.com/javase%2F9%2Fdocs%2Fapi%2F%2F/java/util/zip/CRC32C.html#update-int-].
> Other implementations do not specify whether the whole int is read or not.
> Since each call adds only a single byte, I'm not sure this is any better than
> using a byte buffer/array to either roll the longs into the checksum one by
> one, or batch the checksum computation on a buffer of all the longs under a
> tree node.
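
The two update strategies weighed above can be sketched as follows. This is a minimal illustration, not the code merged for this issue: the class name LongChecksumSketch and its method names are hypothetical, and it uses java.util.zip.CRC32 with the byte-array update so that it also runs on Java 8. It also checks the behaviour the description relies on, that update(int) consumes only the low eight bits of its argument.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

// Hypothetical sketch class; not part of Ozone.
class LongChecksumSketch {

  // Option 1: roll the longs into the checksum one by one,
  // reusing a single 8-byte scratch buffer.
  public static long oneByOne(long[] values) {
    Checksum crc = new CRC32();
    ByteBuffer scratch = ByteBuffer.allocate(Long.BYTES);
    for (long v : values) {
      scratch.clear();
      scratch.putLong(v);
      crc.update(scratch.array(), 0, Long.BYTES);
    }
    return crc.getValue();
  }

  // Option 2: pack all longs into one buffer and make a single update call.
  public static long batched(long[] values) {
    Checksum crc = new CRC32();
    ByteBuffer buf = ByteBuffer.allocate(values.length * Long.BYTES);
    for (long v : values) {
      buf.putLong(v);
    }
    crc.update(buf.array(), 0, buf.position());
    return crc.getValue();
  }

  // update(int) keeps only the low eight bits, so 0x12345678 and 0x78
  // contribute the same single byte to the checksum.
  public static boolean updateIntUsesLowByteOnly() {
    Checksum a = new CRC32();
    a.update(0x12345678);
    Checksum b = new CRC32();
    b.update(0x78);
    return a.getValue() == b.getValue();
  }

  public static void main(String[] args) {
    long[] values = {1L, 2L, 0x0102030405060708L};
    System.out.println("same checksum: " + (oneByOne(values) == batched(values)));
    System.out.println("update(int) low byte only: " + updateIntUsesLowByteOnly());
  }
}
```

Both options feed the same byte sequence (ByteBuffer writes longs big-endian by default), so they produce the same checksum. They differ only in how many update calls are made and how much buffer space is held at once.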