Hi,
Why is the checksum computed per io.bytes.per.checksum bytes (defaults
to 512) instead of over the complete block at once (dfs.block.size
defaults to 67108864)? If a block is corrupt then the entire block has
to be replicated anyway. Isn't it more efficient to do the checksum for
the complete block at once?
A smaller checksum interval decreases the overhead for random access.
If one seeks to a random location, one must, on average, read and
checksum an extra checksumInterval/2 bytes. 512 was chosen as a value
that, with four-byte CRC32, reduced the impact on small seeks while
increasing the storage a
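
The trade-off described above can be put into numbers. The following is a minimal sketch, not HDFS code: the helper names (storage_overhead, avg_extra_bytes_on_seek, chunk_checksums, verify_read) are hypothetical, and zlib.crc32 stands in for the four-byte CRC32 mentioned in the reply. It uses the default values quoted in the question (512 and 67108864).

```python
import zlib

BYTES_PER_CHECKSUM = 512      # io.bytes.per.checksum default quoted above
BLOCK_SIZE = 67108864         # dfs.block.size default quoted above
CRC_BYTES = 4                 # four-byte CRC32


def storage_overhead(chunk_size, crc_bytes=CRC_BYTES):
    """Fraction of extra storage spent on checksums for a given interval."""
    return crc_bytes / chunk_size


def avg_extra_bytes_on_seek(chunk_size):
    """Average extra bytes read and checksummed on a random seek
    (checksumInterval / 2, as stated in the reply)."""
    return chunk_size / 2


def chunk_checksums(data, chunk_size=BYTES_PER_CHECKSUM):
    """One CRC32 per chunk_size-byte chunk of data."""
    return [zlib.crc32(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def verify_read(data, checksums, offset, length, chunk_size=BYTES_PER_CHECKSUM):
    """Verify only the chunks overlapping [offset, offset + length).

    Returns the number of bytes that had to be checksummed; raises
    ValueError if a chunk does not match its stored checksum.
    """
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    checked = 0
    for idx in range(first, last + 1):
        chunk = data[idx * chunk_size:(idx + 1) * chunk_size]
        if zlib.crc32(chunk) != checksums[idx]:
            raise ValueError("checksum mismatch in chunk %d" % idx)
        checked += len(chunk)
    return checked


if __name__ == "__main__":
    print("overhead at 512 B:", storage_overhead(BYTES_PER_CHECKSUM))
    print("avg extra bytes per seek at 512 B:",
          avg_extra_bytes_on_seek(BYTES_PER_CHECKSUM))
    print("avg extra bytes per seek at block size:",
          avg_extra_bytes_on_seek(BLOCK_SIZE))
```

With a 512-byte interval a small read only checksums the chunks it touches, at a storage cost of 4/512 of the data size. With one checksum per block, the same small read would have to checksum the whole block.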
Doing CRC32 on a huge data block also reduces its error detection
capability.
If you need more information on this topic, this paper will be a good
starting point:
http://www.ece.cmu.edu/~koopman/networks/dsn02/dsn02_koopman.pdf
Kihwal
On 6/24/11 9:50 AM, "Doug Cutting" wrote:
> A smaller c
[
https://issues.apache.org/jira/browse/HDFS-2077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved HDFS-2077.
---
Resolution: Fixed
Hadoop Flags: [Reviewed]
Thanks. I added a comment in that area of the code be
[
https://issues.apache.org/jira/browse/HDFS-2078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved HDFS-2078.
---
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed to branch, thanks for reviewing, Eli.
> 10
Umbrella JIRA for separating block management and name space management in
NameNode
---
Key: HDFS-2106
URL: https://issues.apache.org/jira/browse/HDFS-2106
Project: Hadoo
Move block management code to a package
---
Key: HDFS-2107
URL: https://issues.apache.org/jira/browse/HDFS-2107
Project: Hadoop HDFS
Issue Type: Sub-task
Components: name-node
Reporte
[
https://issues.apache.org/jira/browse/HDFS-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved HDFS-2088.
---
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed to branch, thanks Eli
> Move edits log arc
[
https://issues.apache.org/jira/browse/HDFS-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved HDFS-2093.
---
Resolution: Fixed
Hadoop Flags: [Reviewed]
Committed to branch, thanks for review
> 1073: Handl
Move datanode heartbeat handling to BlockManager
Key: HDFS-2108
URL: https://issues.apache.org/jira/browse/HDFS-2108
Project: Hadoop HDFS
Issue Type: Sub-task
Components: name-node