[jira] Created: (HDFS-1456) Provide builder for constructing instances of MiniDFSCluster

2010-10-13 Thread Jakob Homan (JIRA)
Provide builder for constructing instances of MiniDFSCluster Key: HDFS-1456 URL: https://issues.apache.org/jira/browse/HDFS-1456 Project: Hadoop HDFS Issue Type: Improvement

Re: DataBlockScanner scan period

2010-10-13 Thread Thanh Do
Oh, now I see the problem. The implication here is that some blocks might not be scanned for a very long time, because the scanner may not finish scanning all the blocks within 3 weeks, and after that it starts over again, ... Interesting, thanks for the prompt reply, Brian. Thanh On Wed, Oct 13, 2010

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
On Oct 13, 2010, at 7:29 PM, Thanh Do wrote: > Hi Brian, > > If this is the case, then is there any chance that > somehow the DataBlockScanner cannot finish > the verification of all the blocks in three weeks > (e.g., a node has a very large number of blocks)? > Yes. At some point, I'd rea

Re: DataBlockScanner scan period

2010-10-13 Thread Thanh Do
Hi Brian, If this is the case, then is there any chance that somehow the DataBlockScanner cannot finish the verification of all the blocks in three weeks (e.g., a node has a very large number of blocks)? Thanh On Wed, Oct 13, 2010 at 7:18 PM, Brian Bockelman wrote: > Hi Thanh, > > That is co

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
Hi Thanh, That is correct. Last time I read the code, Hadoop scheduled the block verifications randomly throughout the period in order to avoid periodic effects (i.e., high load every N minutes). Brian On Oct 13, 2010, at 7:14 PM, Thanh Do wrote: > Brian, > > When you say *attempt* to compl
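
A minimal Java sketch of the idea described in this message, under stated assumptions: each block gets one roughly uniform random offset inside the scan period, so verification load is spread out instead of arriving in bursts. The class RandomScanSchedule and its methods are hypothetical names for illustration, not the actual DataBlockScanner code. The averageGapHours helper only restates the arithmetic asked about in the thread (3 weeks divided by 100 blocks).

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; this is NOT the Hadoop DataBlockScanner implementation.
final class RandomScanSchedule {

    /**
     * Picks one pseudo-random offset (in milliseconds) within [0, periodMillis)
     * for each block and returns the offsets in ascending order.
     */
    static long[] schedule(int blockCount, long periodMillis, Random rng) {
        long[] offsets = new long[blockCount];
        for (int i = 0; i < blockCount; i++) {
            // floorMod keeps the result non-negative and below periodMillis.
            offsets[i] = Math.floorMod(rng.nextLong(), periodMillis);
        }
        Arrays.sort(offsets);
        return offsets;
    }

    /** Average time between consecutive block scans when blockCount blocks share one period. */
    static double averageGapHours(int blockCount, long periodHours) {
        return (double) periodHours / blockCount;
    }

    public static void main(String[] args) {
        long periodHours = 21 * 24L;                     // three weeks, as quoted in the thread
        long periodMillis = periodHours * 60 * 60 * 1000;
        long[] offsets = schedule(100, periodMillis, new Random(42));
        System.out.println("first scan offset (ms): " + offsets[0]);
        System.out.println("average gap (hours): " + averageGapHours(100, periodHours));
    }
}
```

With 100 blocks and a 504-hour period, the average gap works out to 5.04 hours, but any individual block is still scanned only once per period.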

Re: DataBlockScanner scan period

2010-10-13 Thread Thanh Do
Brian, When you say *attempt* to complete an *entire* node scan, you mean, for example, that if a node has 100 block files, it will try to verify all 100 blocks every 3 weeks? That is, on average, a block is scanned every (3 weeks / 100) time interval? Thanks Thanh On Wed, Oct 13, 2010 at 7:07 PM, Bri

Re: DataBlockScanner scan period

2010-10-13 Thread Brian Bockelman
Hi Thanh, The scan period is the period that hadoop *attempts* to complete an entire node scan. That is, if it's set to 3 weeks, HDFS will try to scan each block once every 3 weeks. Obviously, depending on the bandwidth you have made available to the scanning thread, you can specify impossibl
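
The point about bandwidth can be checked with simple arithmetic. The following Java sketch is an assumption-laden illustration, not Hadoop code: the class ScanPeriodEstimate, its methods, and the example data sizes and throughput are all hypothetical. Only the three-week default comes from the constant quoted in this thread.

```java
// Hypothetical helper; not part of Hadoop. Sizes and bandwidth are illustrative assumptions.
final class ScanPeriodEstimate {

    // Three weeks in hours, matching DEFAULT_SCAN_PERIOD_HOURS as quoted in the thread.
    static final long DEFAULT_SCAN_PERIOD_HOURS = 21 * 24L;

    /** Hours needed to read totalBytes once at a steady bytesPerSecond. */
    static double hoursForFullScan(long totalBytes, long bytesPerSecond) {
        return (double) totalBytes / bytesPerSecond / 3600.0;
    }

    /** True if one full pass over the node's data fits inside the scan period. */
    static boolean fitsInPeriod(long totalBytes, long bytesPerSecond, long periodHours) {
        return hoursForFullScan(totalBytes, bytesPerSecond) <= periodHours;
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long tib = mib * 1024L * 1024L;
        long scanBytesPerSecond = mib;   // assumed scanner throughput of 1 MiB/s

        for (long dataBytes : new long[] {tib, 10 * tib}) {
            System.out.printf("%d TiB: %.1f hours, fits in %d hours: %b%n",
                    dataBytes / tib,
                    hoursForFullScan(dataBytes, scanBytesPerSecond),
                    DEFAULT_SCAN_PERIOD_HOURS,
                    fitsInPeriod(dataBytes, scanBytesPerSecond, DEFAULT_SCAN_PERIOD_HOURS));
        }
    }
}
```

Under these assumed numbers, 1 TiB takes about 291 hours and fits in the 504-hour period, while 10 TiB would need about 2913 hours, so the configured period could not be met.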

DataBlockScanner scan period

2010-10-13 Thread Thanh Do
Hi again, Could anybody explain the scanning period policy of DataBlockScanner? That is, how often does it wake up and scan a block file? When looking at the code, I found static final long DEFAULT_SCAN_PERIOD_HOURS = 21*24L; // three weeks but definitely it does not wake up and pick a

[jira] Created: (HDFS-1455) Record DFS client/cli id with username/kerbros session token in audit log or hdfs client trace log

2010-10-13 Thread Eric Yang (JIRA)
Record DFS client/cli id with username/kerbros session token in audit log or hdfs client trace log -- Key: HDFS-1455 URL: https://issues.apache.org/jira/browse/HDFS-14

[jira] Resolved: (HDFS-1453) Need a command line option in RaidShell to fix blocks using raid

2010-10-13 Thread Ramkumar Vadali (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ramkumar Vadali resolved HDFS-1453. --- Resolution: Invalid RAID is a MR project, will reopen this under MR. > Need a command line op

[jira] Created: (HDFS-1454) Update the documentation to reflect true client caching strategy

2010-10-13 Thread Jeff Hammerbacher (JIRA)
Update the documentation to reflect true client caching strategy Key: HDFS-1454 URL: https://issues.apache.org/jira/browse/HDFS-1454 Project: Hadoop HDFS Issue Type: Improvemen