Re: [DISCUSS] Remove append?

2012-03-22 Thread Dhruba Borthakur
I think "append" would be useful. But not precisely sure which applications would use it. I would vote to keep the code though and not remove it. -dhruba On Thu, Mar 22, 2012 at 5:49 PM, Eli Collins wrote: > On Thu, Mar 22, 2012 at 5:03 PM, Tsz Wo Sze wrote: > > @Eli, Removing a feature would

[jira] [Created] (HDFS-3051) A zero-copy read api from FSDataInputStream

2012-03-06 Thread dhruba borthakur (Created) (JIRA)
Reporter: dhruba borthakur It would be nice if we could get a new API from FSDataInputStream that allows zero-copy reads for HDFS readers. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https
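
For illustration only, a minimal sketch of the shape such an API could take, built on a plain java.nio memory mapping; the class and method names are hypothetical and this is not the interface any HDFS release defines:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

/**
 * Hypothetical sketch of a zero-copy read: instead of copying block data
 * into a caller-supplied byte[], expose a read-only view of a memory-mapped
 * region of the local block file. Names are illustrative only.
 */
public class ZeroCopyReadSketch {

  /** Map [offset, offset+length) of a local block file and return it read-only. */
  public static ByteBuffer mapBlockRegion(String blockFile, long offset, int length)
      throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(blockFile, "r");
         FileChannel ch = raf.getChannel()) {
      // The mapping stays valid after the channel is closed; no bytes are copied
      // into the Java heap, the caller reads straight from the page cache.
      return ch.map(FileChannel.MapMode.READ_ONLY, offset, length).asReadOnlyBuffer();
    }
  }

  public static void main(String[] args) throws IOException {
    ByteBuffer buf = mapBlockRegion(args[0], 0, 4096);
    System.out.println("first byte = " + buf.get(0));
  }
}
```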

Re: Merging some trunk changes to 23

2012-01-05 Thread Dhruba Borthakur
+1 for 1 adding "BR scalability (HDFS-395, HDFS-2477, HDFS-2495, HDFS-2476)" These are very significant performance improvements that would be very valuable in 0.23. -dhruba On Thu, Jan 5, 2012 at 7:52 PM, Eli Collins wrote: > Hey gang, > > I was looking at the difference between hdfs trunk a

[jira] [Created] (HDFS-2744) Extend FSDataInputStream to allow fadvise

2012-01-03 Thread dhruba borthakur (Created) (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur Now that we have direct reads from local HDFS block files (HDFS-2246), it might make sense to make FSDataInputStream support fadvise calls. I have an application (HBase) that would like to tell the OS that it should not buffer
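
As a sketch of what an fadvise-style hint could look like from the application's side (hypothetical interface, not a Hadoop API; the advice values simply mirror the usual posix_fadvise flags):

```java
/**
 * Hypothetical sketch of an fadvise-style hint interface that a stream such as
 * FSDataInputStream could implement; none of these names are real Hadoop APIs.
 */
public interface ReadAdviceSketch {

  enum Advice {
    NORMAL,      // no special treatment
    SEQUENTIAL,  // expect sequential access: aggressive readahead is fine
    RANDOM,      // expect random access: skip readahead
    DONTNEED,    // data will not be reused: drop it from the page cache
    WILLNEED     // data will be reused soon: keep or prefetch it
  }

  /** Advise the OS about the access pattern for [offset, offset+length). */
  void adviseRange(long offset, long length, Advice advice);

  /** Convenience default for callers that want to hint the whole stream. */
  default void adviseStream(Advice advice) {
    adviseRange(0, Long.MAX_VALUE, advice);
  }
}
```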

[jira] [Created] (HDFS-2699) Store data and checksums together in block file

2011-12-17 Thread dhruba borthakur (Created) (JIRA)
borthakur Assignee: dhruba borthakur The current implementation of HDFS stores the data in one block file and the metadata (checksum) in another block file. This means that every read from HDFS actually consumes two disk iops, one to the data file and one to the checksum file. This is a
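
A back-of-the-envelope sketch of the proposed layout helps make the point: if each checksum is stored inline next to the chunk it covers, a read translates into a single seek in one combined file. The 512-byte chunk and 4-byte CRC sizes below are illustrative assumptions:

```java
/**
 * Sketch of an interleaved block layout: a 4-byte CRC stored in front of every
 * 512-byte data chunk, so data and checksum are read with a single seek. The
 * chunk and checksum sizes are assumptions, not values HDFS uses.
 */
public class InlineChecksumLayoutSketch {
  static final int CHUNK = 512;     // bytes of data per checksum
  static final int CRC = 4;         // bytes of checksum per chunk

  /** Offset in the combined file where the data byte at logicalOffset lives. */
  static long physicalOffset(long logicalOffset) {
    long chunkIndex = logicalOffset / CHUNK;
    long within = logicalOffset % CHUNK;
    return chunkIndex * (CHUNK + CRC) + CRC + within;
  }

  public static void main(String[] args) {
    // Reading logical byte 1000 means a single seek to byte 1008 of the combined
    // file, instead of one seek in the data file plus one in the checksum file.
    System.out.println(physicalOffset(1000));
  }
}
```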

Re: Retire 0.20-append branch?

2011-10-03 Thread Dhruba Borthakur
+1. -dhruba On Mon, Oct 3, 2011 at 2:50 PM, Todd Lipcon wrote: > Hi all, > > Now that the patches from 0.20-append have been merged into 0.20.205, > I think we should consider officially retiring this branch. > > In essence, that means that we would: > 1) Ask contributors to provide patches aga

[jira] [Created] (HDFS-2128) Support for pluggable Trash policies

2011-07-05 Thread dhruba borthakur (JIRA)
Support for pluggable Trash policies Key: HDFS-2128 URL: https://issues.apache.org/jira/browse/HDFS-2128 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur

Re: How to build facebook-hadoop?

2011-06-14 Thread Dhruba Borthakur
Hi Matt, This part of the code is not part of any Apache release, so you won't be able to get any support from this list. I would suggest that your best bet would be to participate in the github based mailing lists itself. Also, that code release is not a "distribution", so I won't be surprised if

[jira] [Created] (HDFS-2006) ability to support storing extended attributes per file

2011-05-26 Thread dhruba borthakur (JIRA)
Components: name-node Reporter: dhruba borthakur It would be nice if HDFS provides a feature to store extended attributes for files, similar to the one described here: http://en.wikipedia.org/wiki/Extended_file_attributes. The challenge is that it has to be done in such a way that a site

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

2011-04-26 Thread Dhruba Borthakur
I feel that making the datanode talk to multiple namenodes is very valuable, especially when there is plenty of storage available on a single datanode machine (think 24 TB to 36 TB) and a single namenode does not have enough memory to hold all file metadata for such a large cluster in memory. This

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

2011-04-23 Thread Dhruba Borthakur
Given that we will be re-organizing the svn tree very soon and the fact that the design and most of the implementation is complete, let's merge it into trunk! -dhruba On Fri, Apr 22, 2011 at 9:48 AM, Suresh Srinivas wrote: > A few weeks ago, I had sent an email about the progress of HDFS federat

[jira] [Resolved] (HDFS-1392) Improve namenode scalability by prioritizing datanode heartbeats over block reports

2011-03-29 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1392. Resolution: Duplicate duplicate of HDFS-1541 > Improve namenode scalability

Re: VOTE: Committing HADOOP-6949 to 0.22 branch

2011-03-28 Thread Dhruba Borthakur
This is a very effective optimization, +1 on pulling it to 0.22. -dhruba On Mon, Mar 28, 2011 at 9:39 PM, Konstantin Shvachko wrote: > HADOOP-6949 introduced a very important optimization to the RPC layer. > Based > on the benchmarks presented in HDFS-1583 this provides an order of > magnitude

Re: Branch for HDFS-1073 and related work

2011-03-28 Thread Dhruba Borthakur
+1. I think this will be very helpful in moving the design forward quickly. -dhruba On Mon, Mar 28, 2011 at 1:14 PM, Todd Lipcon wrote: > Hi all, > > I discussed this with a couple folks over the weekend who are involved in > the project, but wanted to let the dev community at large know: > >

[jira] [Created] (HDFS-1783) Ability for HDFS client to write replicas in parallel

2011-03-24 Thread dhruba borthakur (JIRA)
Components: hdfs client Reporter: dhruba borthakur Assignee: dhruba borthakur The current implementation of HDFS pipelines the writes to the three replicas. This introduces some latency for realtime latency sensitive applications. An alternate implementation that allows the client
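
A rough sketch of the alternate scheme, with the client pushing the same packet to every replica concurrently and waiting for all acks; ReplicaWriter is a stand-in interface, not an HDFS class:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

/**
 * Sketch of parallel replica writes: instead of pipelining a packet through
 * replica 1 -> 2 -> 3, the client sends the packet to all replicas at once and
 * waits for every ack, trading extra client bandwidth for lower write latency.
 */
public class ParallelReplicaWriteSketch {

  interface ReplicaWriter {
    void write(String datanode, byte[] packet) throws IOException;
  }

  static void writeToAll(List<String> replicas, ReplicaWriter writer, byte[] packet)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(replicas.size());
    try {
      List<Future<?>> acks = new ArrayList<>();
      for (String dn : replicas) {
        acks.add(pool.submit(() -> { writer.write(dn, packet); return null; }));
      }
      for (Future<?> ack : acks) {
        ack.get();                     // the write completes when every replica acks
      }
    } finally {
      pool.shutdown();
    }
  }
}
```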

Re: Merging Namenode Federation feature (HDFS-1052) to trunk

2011-03-14 Thread Dhruba Borthakur
Hi folks, The design for the federation work has been published and there is a very well-written design document. It explains the pros and cons of each design point. It would be nice if more people could review this document and provide comments on how to make it better. The implementation is in p

[jira] Created: (HDFS-1605) Convert DFSInputStream synchronized sections to a ReadWrite lock

2011-01-30 Thread dhruba borthakur (JIRA)
: Improvement Components: hdfs client Reporter: dhruba borthakur Assignee: dhruba borthakur Hbase does concurrent preads from multiple threads to different blocks of the same hdfs file. Each of these pread calls invoke DFSInputStream.getFileLength() and
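
A minimal sketch of the kind of conversion being proposed, with read-mostly accessors taking a shared lock and the rare updates taking the exclusive lock; this is illustrative code, not DFSInputStream itself:

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/**
 * Sketch of a synchronized-to-ReadWriteLock conversion: many pread threads can
 * call the read-side accessor concurrently, while updates serialize on the
 * write lock.
 */
public class ReadWriteLockedLength {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private long fileLength;

  /** Read-mostly accessor; concurrent callers do not block each other. */
  public long getFileLength() {
    lock.readLock().lock();
    try {
      return fileLength;
    } finally {
      lock.readLock().unlock();
    }
  }

  /** Rare update path; takes the exclusive lock. */
  public void updateFileLength(long newLength) {
    lock.writeLock().lock();
    try {
      fileLength = newLength;
    } finally {
      lock.writeLock().unlock();
    }
  }
}
```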

Re: Use VIP for DataNodes

2011-01-29 Thread Dhruba Borthakur
> Thanks for sharing. Seems we only need to let NameNode & Backup Node be in the > same subnet. Why is it a main problem? > > regards > macf > > On Wed, Jan 26, 2011 at 4:14 PM, Harsh J wrote: > > > Thank you for correcting that :) > > > > On Wed, J

Re: Use VIP for DataNodes

2011-01-25 Thread Dhruba Borthakur
Fb does not use the VIP approach; we tried it but quickly found some limitations, one main problem being that the failover server pair has to be in the same subnet (for VIP to work). Instead we now use the AvatarNode integrated with Zookeeper. -dhruba On Tue, Jan 25, 2011 at 6:12 PM, Harsh

Review Request: Allow a datanode to copy a block to a datanode on a foreign HDFS cluster

2011-01-24 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/346/ --- Review request for hadoop-hdfs. Summary --- This patch introduces an RPC to

[jira] Created: (HDFS-1593) Allow a datanode to copy a block to a datanode on a foreign HDFS cluster.

2011-01-24 Thread dhruba borthakur (JIRA)
Issue Type: Sub-task Reporter: dhruba borthakur Assignee: dhruba borthakur This patch introduces an RPC to the datanode to allow it to copy a block to a datanode on a remote HDFS cluster. -- This message is automatically generated by JIRA. - You can reply to this email to

Re: Review Request: Make exiting safemode a lot faster.

2011-01-23 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/196/ --- (Updated 2011-01-23 20:24:11.634622) Review request for hadoop-hdfs. Changes --

[jira] Created: (HDFS-1581) Metrics from DFSClient

2011-01-12 Thread dhruba borthakur (JIRA)
Metrics from DFSClient -- Key: HDFS-1581 URL: https://issues.apache.org/jira/browse/HDFS-1581 Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: dhruba borthakur We

[jira] Created: (HDFS-1567) DFSClient should retry reading from all datanodes in round robin fashion

2011-01-03 Thread dhruba borthakur (JIRA)
Type: Improvement Components: hdfs client Reporter: dhruba borthakur In the current implementation, the DFSClient retries the same datanode a few times (for reading) before marking the datanode as "dead" and moving on to trying the read-request from the next rep
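
A small sketch of the proposed retry behaviour, trying each replica once in turn before giving up; ReplicaReader is a hypothetical stand-in for the real read path:

```java
import java.io.IOException;
import java.util.List;

/**
 * Sketch of round-robin retry: instead of retrying one replica until it is
 * declared dead, rotate through all replica locations and fail only after
 * every one has been tried.
 */
public class RoundRobinReadSketch {

  interface ReplicaReader {
    byte[] read(String datanode, long offset, int len) throws IOException;
  }

  static byte[] readRoundRobin(List<String> replicas, ReplicaReader reader,
                               long offset, int len) throws IOException {
    IOException last = null;
    for (String dn : replicas) {            // one attempt per replica, in order
      try {
        return reader.read(dn, offset, len);
      } catch (IOException e) {
        last = e;                           // remember the failure, move on
      }
    }
    throw new IOException("all " + replicas.size() + " replicas failed", last);
  }
}
```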

[jira] Resolved: (HDFS-1563) create(file, true) appears to be violating atomicity

2011-01-02 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1563. Resolution: Won't Fix I vote that we keep the semantics of the create(overwrite==true

Re: Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- (Updated 2010-12-28 14:09:46.364223) Review request for hadoop-hdfs. Summary --

Re: Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- (Updated 2010-12-28 13:55:18.599061) Review request for hadoop-hdfs. Changes --

Review Request: Make exiting safemode a lot faster.

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/196/ --- Review request for hadoop-hdfs. Summary --- Make exiting safemode a lot fast

Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- Review request for hadoop-hdfs. Summary --- Make Datanode handle errors to n

Re: Good VLDB paper on WALs

2010-12-27 Thread Dhruba Borthakur
Hi Todd, Good paper, it would be nice to get the Flush-Pipelining technique (described in the paper) implemented in HBase and HDFS write-ahead logs. (I am CC-ing this to hdfs-...@hadoop as well) HDFS currently uses Hadoop RPC and the server thread blocks till the WAL is written to disk. In earlie

[jira] Created: (HDFS-1540) Make Datanode handle errors to namenode.register call more elegantly

2010-12-15 Thread dhruba borthakur (JIRA)
: Bug Reporter: dhruba borthakur Assignee: dhruba borthakur When the datanode receives a "Connection reset by peer" from the namenode.register() call, it exits. This causes many datanodes to die. -- This message is automatically generated by JIRA. - You can reply to this emai

[jira] Created: (HDFS-1539) prevent data loss when a cluster suffers a power loss

2010-12-15 Thread dhruba borthakur (JIRA)
Components: data-node, hdfs client, name-node Reporter: dhruba borthakur We have seen an instance where an external outage caused many datanodes to reboot at around the same time. This resulted in many corrupted blocks. These were recently written blocks; the current implementation of HDFS
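
One direction such a fix can take, sketched with plain java.nio (illustrative only, not the datanode's actual block-close path), is to force recently written block data to the device before the block is declared finalized:

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/**
 * Sketch of syncing a recently written block file to stable storage so that a
 * sudden power loss cannot leave only page-cache copies of its contents.
 */
public class SyncOnCloseSketch {
  static void syncToDisk(String blockFile) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(blockFile, "rw")) {
      raf.getChannel().force(true);   // flush data and file metadata to the device
    }
  }

  public static void main(String[] args) throws IOException {
    syncToDisk(args[0]);
  }
}
```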

[jira] Resolved: (HDFS-1520) HDFS 20 append: Lightweight NameNode operation to trigger lease recovery

2010-12-09 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1520. Resolution: Fixed Hadoop Flags: [Reviewed] I just committed this. Thanks Hairong

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-07 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-07 11:01:28.866525) Review request for hadoop-hdfs. Changes --

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-02 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-02 11:21:23.394822) Review request for hadoop-hdfs. Changes --

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-01 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-01 23:05:00.973158) Review request for hadoop-hdfs. Changes --

Review Request: Ability to do savenamespace without being in safemode

2010-11-30 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- Review request for hadoop-hdfs. Summary --- The namenode need not be in safe

[jira] Resolved: (HDFS-1162) Reduce the time required for a checkpoint

2010-11-23 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1162. Resolution: Won't Fix This has been fixed via various changes to the fsimage/edits

[jira] Created: (HDFS-1509) Resync discarded directories in fs.name.dir during saveNamespace command

2010-11-18 Thread dhruba borthakur (JIRA)
Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur In the current implementation, if the Namenode encounters an error while writing to a fs.name.dir directory, it stops writing new edits to that directory. My proposal is to make the namenode write

[jira] Created: (HDFS-1508) Ability to do savenamespace without being in safemode

2010-11-18 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur In the current code, the administrator can run savenamespace only after putting the namenode in safemode. This means that applications that are writing to HDFS encounter errors because the NN is in safemode. We would like to allow

Re: Review Request: Populate needed replication queues before leaving safe mode.

2010-11-16 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/105/#review39 --- http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/ha

[jira] Created: (HDFS-1501) The logic that makes namenode exit safemode should be pluggable

2010-11-16 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: Patrick Kling HDFS RAID creates parity blocks for data blocks. So, even if all replicas of a block are missing, it is possible to recreate it from the parity blocks. Thus, when the namenode restarts, it should use a different RAID
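
A sketch of what a pluggable exit policy could look like, with a RAID-aware implementation counting parity-reconstructible blocks as available; the interface and names are hypothetical:

```java
/**
 * Sketch of a pluggable safemode-exit policy: the namenode asks the policy
 * whether enough blocks are accounted for, and a RAID-aware implementation
 * also counts blocks that can be rebuilt from parity.
 */
public interface SafeModeExitPolicySketch {

  /** @return true when the namenode may leave safemode. */
  boolean canLeaveSafeMode(long totalBlocks, long reportedBlocks,
                           long reconstructibleFromParity);

  /** Default policy: the usual threshold on directly reported blocks. */
  class ThresholdPolicy implements SafeModeExitPolicySketch {
    private final double threshold;
    public ThresholdPolicy(double threshold) { this.threshold = threshold; }
    @Override
    public boolean canLeaveSafeMode(long total, long reported, long parity) {
      return total == 0 || (double) reported / total >= threshold;
    }
  }

  /** RAID-aware policy: blocks recoverable from parity count as available. */
  class RaidAwarePolicy implements SafeModeExitPolicySketch {
    private final double threshold;
    public RaidAwarePolicy(double threshold) { this.threshold = threshold; }
    @Override
    public boolean canLeaveSafeMode(long total, long reported, long parity) {
      return total == 0 || (double) (reported + parity) / total >= threshold;
    }
  }
}
```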

Re: Review Request: Add listCorruptFileBlocks to DistributedFileSystem (and ClientProtocol)

2010-11-01 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18/#review21 --- http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/had

Re: Review Request: Test review

2010-10-26 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13/ --- (Updated 2010-10-26 17:27:35.304640) Review request for hadoop-hdfs and Dmytro Mol

Review Request: Test review

2010-10-26 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13/ --- Review request for hadoop-hdfs and Dmytro Molkov. Summary --- Test Review req

[jira] Created: (HDFS-1463) accessTime updates should not occur in safeMode

2010-10-19 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur FSNamesystem.getBlockLocations sometimes need to update the accessTime of files. If the namenode is in safemode, this call should fail. -- This message is automatically generated by JIRA. - You can reply to this email to add

Re: Reason to store 64 block file in a sub directory?

2010-10-11 Thread Dhruba Borthakur
The number is just an ad hoc number. The policy is not to put too many block files in the same directory, because some local filesystems behave badly if the number of files in the same directory exceeds a certain value. -dhruba On Mon, Oct 11, 2010 at 1:15 PM, Thanh Do wrote: > Hi all, > > can an
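
For illustration, a simple way to keep any one directory small is to bucket block files by block id; the 64-entry cap and modulo scheme below are assumptions for the sketch, not the DataNode's actual layout:

```java
import java.io.File;

/**
 * Illustrative sketch of bucketing block files into subdirectories so that no
 * single directory grows unboundedly.
 */
public class BlockDirBucketSketch {
  static final int FILES_PER_DIR = 64;   // cap chosen to keep directories small

  /** Choose a subdirectory for a block id by simple modulo bucketing. */
  static File subdirFor(File root, long blockId) {
    int bucket = (int) Math.floorMod(blockId, (long) FILES_PER_DIR);
    return new File(root, "subdir" + bucket);
  }

  public static void main(String[] args) {
    System.out.println(subdirFor(new File("/data/dfs/current"), 123456789L));
  }
}
```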

[jira] Created: (HDFS-1438) Expose getCorruptFiles APi via the FileSystem API

2010-10-02 Thread dhruba borthakur (JIRA)
client, name-node Reporter: dhruba borthakur HDFS- proposed that the getCorruptFiles API be exposed via the DistributedFileSystem API. Details here http://goo.gl/IHL5 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue

[jira] Created: (HDFS-1432) HDFS across data centers: HighTide

2010-09-29 Thread dhruba borthakur (JIRA)
HDFS across data centers: HighTide -- Key: HDFS-1432 URL: https://issues.apache.org/jira/browse/HDFS-1432 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur

Re: is dfsclient caches the data block to local disk before writing?

2010-09-26 Thread Dhruba Borthakur
DFS client does not write the data to local disk first. Instead, it streams data directly to the datanodes in the write pipeline. I will update the document. On Sun, Sep 26, 2010 at 5:21 AM, Gokulakannan M wrote: > Hi, > > > > Is staging still used in hdfs when writing the data? Thi

[jira] Resolved: (HDFS-1147) Reduce NN startup time by reducing the processing time of block reports

2010-09-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1147. Resolution: Duplicate > Reduce NN startup time by reducing the processing time of bl

[jira] Resolved: (HDFS-1393) When a directory with huge number of files is deleted, the NN becomes unresponsive

2010-09-12 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1393. Resolution: Duplicate duplicate of HDFS-1143 > When a directory with huge number of fi

[jira] Created: (HDFS-1393) When a directory with huge number of files is deleted, the NN becomes unresponsive

2010-09-11 Thread dhruba borthakur (JIRA)
HDFS Issue Type: Bug Components: name-node Reporter: dhruba borthakur If one deletes a directory with about 2 million files, the namenode becomes unresponsive for close to 20 seconds. The reason is that the FSNamesystem lock is held while all the blocks

[jira] Created: (HDFS-1392) Improve namenode scalability by prioritizing datanode heartbeats over block reports

2010-09-10 Thread dhruba borthakur (JIRA)
: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur When a namenode restarts, it gets heartbeats followed by block reports from the datanodes. The block report processing is heavyweight and can

[jira] Created: (HDFS-1391) Exiting safemode takes a long time when there are lots of blocks in the HDFS

2010-09-10 Thread dhruba borthakur (JIRA)
Issue Type: Bug Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur When the namenode decides to exit safemode, it acquires the FSNamesystem lock and then iterates over all blocks in the blocksmap to determine if any block has any

[jira] Resolved: (HDFS-1384) NameNode should give client the first node in the pipeline from different rack other than that of excludedNodes list in the same rack.

2010-09-10 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1384. Resolution: Duplicate This bug has been fixed in trunk because the client sends the

[jira] Created: (HDFS-1366) reduce namenode startup time by optimising checkBlockInfo while loading fsimage

2010-09-01 Thread dhruba borthakur (JIRA)
HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur The namenode spends about 10 minutes reading in a 14 GB fsimage file into memory and creating all the in-memory data structures. A jstack based debugger clearly shows that most of the

Re: [VOTE] New branch for HDFS-1052 development

2010-08-19 Thread Dhruba Borthakur
+1 Sent from my iPhone On Aug 19, 2010, at 5:07 PM, Suresh Srinivas wrote: I am planning to create a new branch from the trunk for the work related to HDFS-1052 - HDFS scalability using multiple namenodes/ namespaces. Please see the jira for more details. Doing the development in a sep

[jira] Created: (HDFS-1295) Improve namenode restart times by short-circuiting the first block reports from datanodes

2010-07-12 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.22.0 The namenode restart is dominated by the performance of processing block reports. On a 2000 node cluster

[jira] Created: (HDFS-1274) ability to send replication traffic on a separate port to the Datanode

2010-06-29 Thread dhruba borthakur (JIRA)
Type: Improvement Reporter: dhruba borthakur The datanode receives data from a client write request or from a replication request. It is useful to configure the cluster so that dedicated bandwidth is allocated for client writes and replication traffic. This requires that the client

[jira] Resolved: (HDFS-81) Impact in NameNode scalability because heartbeat processing acquires the global lock

2010-06-23 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-81?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-81. -- Resolution: Won't Fix This problem occurs only in old releases and is not present in

[jira] Resolved: (HDFS-1254) 0.20: mark dfs.support.append to be true by default for the 0.20-append branch

2010-06-22 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1254. Hadoop Flags: [Reviewed] Resolution: Fixed I just committed this. > 0.20: m

[jira] Created: (HDFS-1254) 0.20: mark dfs.support.append to be true by default for the 0.20-append branch

2010-06-21 Thread dhruba borthakur (JIRA)
Issue Type: Bug Components: name-node Affects Versions: 0.20-append Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.20-append The 0.20-append branch supports append/sync for HDFS. Change the default configuration to enable

[jira] Resolved: (HDFS-1211) 0.20 append: Block receiver should not log "rewind" packets at INFO level

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1211. Resolution: Fixed I just committed this. Thanks Todd! > 0.20 append: Block receiver sho

[jira] Resolved: (HDFS-1210) DFSClient should log exception when block recovery fails

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1210. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd

[jira] Resolved: (HDFS-1204) 0.20: Lease expiration should recover single files, not entire lease holder

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1204. Resolution: Fixed > 0.20: Lease expiration should recover single files, not entire le

[jira] Resolved: (HDFS-1207) 0.20-append: stallReplicationWork should be volatile

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1207. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd

[jira] Resolved: (HDFS-1141) completeFile does not check lease ownership

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1141. Resolution: Fixed Pulled into hadoop-0.20-append > completeFile does not check le

[jira] Resolved: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-142. --- Resolution: Fixed I have committed this. Thanks Sam, Nicolas and Todd. > In 0.20, move blo

[jira] Resolved: (HDFS-1216) Update to JUnit 4 in branch 20 append

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1216. Resolution: Fixed I just committed this. Thanks Todd! > Update to JUnit 4 in branch

Re: HDFS VFS Driver

2010-06-16 Thread Dhruba Borthakur
hi mike, it will be nice to get a high-level doc on what/how it is implemented. also, you might want to compare it with fuse-dfs http://wiki.apache.org/hadoop/MountableHDFS thanks, dhruba On Wed, Jun 16, 2010 at 8:55 AM, Michael D'Amour wrote: > We have an open source ETL tool (Kettle) which

[jira] Resolved: (HDFS-1054) Remove unnecessary sleep after failure in nextBlockOutputStream

2010-06-11 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1054. Fix Version/s: 0.21.0 (was: 0.20-append) Resolution: Fixed I

[jira] Created: (HDFS-1200) The namenode could remember the last good location of a missing block

2010-06-10 Thread dhruba borthakur (JIRA)
: Improvement Components: name-node Reporter: dhruba borthakur There are times when datanodes die and all replicas of a block are lost. An fsck on the HDFS reports these as "MISSING" blocks in the filesystem. The administrator has to go mine through lots of nameno

[jira] Resolved: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers

2010-06-04 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-200. --- Resolution: Fixed I committed this into 0.20-append branch. > In HDFS, sync() not

[jira] Reopened: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers

2010-06-02 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur reopened HDFS-200: --- This has to be pulled into the branch-0.20-append branch. > In HDFS, sync() not yet guarant

[jira] Created: (HDFS-1179) Implement a file change log

2010-05-27 Thread dhruba borthakur (JIRA)
Implement a file change log --- Key: HDFS-1179 URL: https://issues.apache.org/jira/browse/HDFS-1179 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur

[jira] Created: (HDFS-1162) Reduce the time required for a checkpoint

2010-05-18 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur The checkpoint time increases linearly with the number of files in the cluster. This is a problem with large clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the

[jira] Created: (HDFS-1147) Reduce NN startup time by reducing the processing time of block reports

2010-05-11 Thread dhruba borthakur (JIRA)
Type: Improvement Components: name-node Reporter: dhruba borthakur The NameNode restart times are impacted to a large extent by the processing time of block reports. For a cluster with 150 million blocks, the block report processing in the NN can take up to 20 minutes or so

[jira] Created: (HDFS-1145) When NameNode is shutdown it tries to exit safemode

2010-05-11 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Suppose the NameNode is in safemode. Then we try to shut it down by invoking NameNode.stop(). The stop() method interrupts all waiting threads, which, in turn, causes the SafeMode monitor to exit, thus triggering the replication/deletion of blocks

[jira] Created: (HDFS-1129) Allow HDFS DataNodes and NameNode to have different build versions

2010-05-04 Thread dhruba borthakur (JIRA)
: Improvement Reporter: dhruba borthakur There are times when we want to deploy fixes to the NameNode without restarting the DataNodes. This reduces the restart time of the entire cluster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to

[jira] Created: (HDFS-1108) ability to create a file whose newly allocated blocks are automatically persisted immediately

2010-04-23 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the

[jira] Resolved: (HDFS-983) NameNode transaction log corruption with bad filename

2010-04-15 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-983. --- Resolution: Duplicate I have not seen this problem after we imported the patch from HADOOP

[jira] Resolved: (HDFS-980) Convert FSNamesystem lock to ReadWriteLock

2010-04-13 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-980. --- Resolution: Duplicate Duplicate of HDFS-1093 > Convert FSNamesystem lock to ReadWriteL

[jira] Created: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss

2010-04-12 Thread dhruba borthakur (JIRA)
Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The current HDFS implementation specifies that the first replica is local and the other two replicas are on any two random nodes on a random remote rack. This means
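
A quick back-of-the-envelope calculation shows why purely random placement is fragile; the cluster size, failure count, and block count below are illustrative assumptions:

```java
/**
 * Sketch of the argument behind smarter placement: with random 3-way
 * replication, the chance that some block has all three replicas inside a
 * given failed set of nodes rises quickly with block count.
 */
public class BlockLossSketch {
  public static void main(String[] args) {
    long nodes = 100, failed = 3, blocks = 1_000_000;
    // Probability that one specific block has all 3 replicas inside the
    // failed set: C(failed,3) / C(nodes,3).
    double pOneBlock = choose(failed, 3) / choose(nodes, 3);
    // Rough probability that at least one of `blocks` blocks is lost
    // (independence between blocks is an approximation).
    double pAnyBlock = 1 - Math.pow(1 - pOneBlock, blocks);
    System.out.printf("p(single block lost) = %.2e%n", pOneBlock);
    System.out.printf("p(any block lost)    = %.4f%n", pAnyBlock);
  }

  static double choose(long n, long k) {
    double r = 1;
    for (long i = 1; i <= k; i++) r = r * (n - k + i) / i;
    return r;
  }
}
```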

[jira] Created: (HDFS-1093) Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock

2010-04-11 Thread dhruba borthakur (JIRA)
/HDFS-1093 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Most critical data structures in the NameNode (NN) are protected by synchronized methods in the FSNamesystem class. This essentially makes

Re: [VOTE] Commit hdfs-1024 to 0.20 branch

2010-04-03 Thread Dhruba Borthakur
+1 On 4/2/10, Stack wrote: > Please vote on committing HDFS-1024 to the hadoop 0.20 branch. > > Background: > > HDFS-1024 fixes possible trashing of fsimage because of failed copy > from 2NN and NN. Ordinarily, possible corruption of this proportion > would merit commit w/o need of a vote only Dhr

[jira] Created: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-03-29 Thread dhruba borthakur (JIRA)
Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov If you have a large number of files in HDFS, the fsimage file is very big. When the namenode restarts, it writes a copy of the fsimage to all directories

[jira] Created: (HDFS-1034) Enhance datanode to read data and checksum file in parallel

2010-03-11 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur In the current HDFS implementation, a read of a block issued to the datanode results in a disk access to the checksum file followed by a disk access to the data file. It would be nice to be able to do these two IOs in
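
A sketch of overlapping the two disk accesses with a small thread pool; this is a plain java.nio illustration, not the datanode's actual block-serving code:

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.concurrent.*;

/**
 * Sketch of issuing the data-file and checksum-file reads concurrently so the
 * two disk accesses overlap instead of running back to back.
 */
public class ParallelBlockReadSketch {

  static byte[][] readBoth(String dataFile, String checksumFile)
      throws InterruptedException, ExecutionException {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Future<byte[]> data = pool.submit(() -> Files.readAllBytes(Paths.get(dataFile)));
      Future<byte[]> sums = pool.submit(() -> Files.readAllBytes(Paths.get(checksumFile)));
      return new byte[][] { data.get(), sums.get() };  // both reads in flight together
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    byte[][] r = readBoth(args[0], args[1]);
    System.out.println("data=" + r[0].length + " bytes, checksums=" + r[1].length + " bytes");
  }
}
```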

[jira] Created: (HDFS-1031) Enhance the webUi to list a few of the corrupted files in HDFS

2010-03-08 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur The existing webUI displays something like this: WARNING : There are about 12 missing blocks. Please check the log or run fsck. It would be nice if we can display the filenames that have missing blocks. -- This message is automatically generated by JIRA

[jira] Created: (HDFS-1024) SecondaryNamenode fails to checkpoint because namenode fails with CancelledKeyException

2010-03-05 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: dhruba borthakur Assignee: Dmytro Molkov The secondary namenode fails to retrieve the entire fsimage from the Namenode. It fetches a part of the fsimage but believes that it has fetched the

[jira] Resolved: (HDFS-87) NameNode startup fails if edit log terminates prematurely

2010-02-19 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-87. -- Resolution: Not A Problem > NameNode startup fails if edit log terminates prematur

[jira] Created: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread dhruba borthakur (JIRA)
saveNamespace can corrupt edits log --- Key: HDFS-988 URL: https://issues.apache.org/jira/browse/HDFS-988 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: dhruba

[jira] Created: (HDFS-983) NameNode transaction log corruption with bad filename

2010-02-17 Thread dhruba borthakur (JIRA)
-node Affects Versions: 0.20.1 Reporter: dhruba borthakur The SecondaryNamenode is unable to create checkpoints. The stack trace is attached. This is the second time we have seen this issue. Both of these occurrences happened with unprintable characters in the filename. I am wondering

[jira] Created: (HDFS-978) Record every new block allocation of a file into the transaction log.

2010-02-14 Thread dhruba borthakur (JIRA)
: Improvement Components: name-node Reporter: dhruba borthakur HDFS should record every new block allocation (of a file) into its transaction logs. In the current code, block allocations are persisted only when a file is closed or hflush-ed. This feature will enable HDFS

[jira] Created: (HDFS-976) Hot Standby for NameNode

2010-02-13 Thread dhruba borthakur (JIRA)
Hot Standby for NameNode Key: HDFS-976 URL: https://issues.apache.org/jira/browse/HDFS-976 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Reporter: dhruba borthakur

[jira] Created: (HDFS-966) NameNode recovers lease even in safemode

2010-02-09 Thread dhruba borthakur (JIRA)
: dhruba borthakur The NameNode recovers a lease even when it is in safemode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.

[jira] Created: (HDFS-947) The namenode should redirect a hftp request to read a file to the datanode that has the maximum number of local replicas

2010-02-03 Thread dhruba borthakur (JIRA)
: https://issues.apache.org/jira/browse/HDFS-947 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur A client that uses the Hftp protocol to read a file is redirected by the namenode to a random datanode

Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?

2010-01-22 Thread Dhruba Borthakur
+1 for making this patch go into 0.21. thanks, dhruba On Fri, Jan 22, 2010 at 10:25 AM, Todd Lipcon wrote: > Hi Steve, > > All of the below may be good ideas, but I don't think they're relevant to > the discussion at hand. Specifically, none of them can enter 0.21 without a > vote as they'd be

Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?

2010-01-22 Thread Dhruba Borthakur
> tweaked Hadoop to allow the datanodes to get the entire list are you referring to datanodes or dfs clients here? The client already gets the entire list of replica locations for a block from the namenode. and one could always develop a DFS client that is free to choose whatever locations it dec

[jira] Created: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file

2010-01-13 Thread dhruba borthakur (JIRA)
: Improvement Components: hdfs client Reporter: dhruba borthakur In the current trunk, the HDFS client methods writeChunk() and hflush/sync are synchronized. This means that if a hflush/sync is in progress, an application cannot write data to the HDFS client buffer. This reduces
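
A sketch of the decoupling being asked for: writes append to an in-memory buffer under a short lock, while the flusher swaps the buffer out and performs the slow I/O without holding that lock; this is illustrative code, not DFSOutputStream:

```java
import java.io.IOException;
import java.io.OutputStream;

/**
 * Sketch of letting a flush run in parallel with new writes: write() only
 * briefly holds the buffer lock, and flush() detaches the filled buffer so
 * the long-running I/O happens with the lock released.
 */
public class ConcurrentFlushSketch {
  private final Object bufferLock = new Object();
  private StringBuilder buffer = new StringBuilder();
  private final OutputStream sink;

  public ConcurrentFlushSketch(OutputStream sink) { this.sink = sink; }

  /** Called by the writer thread; only briefly holds the buffer lock. */
  public void write(String chunk) {
    synchronized (bufferLock) {
      buffer.append(chunk);
    }
  }

  /** Called by the flusher; the slow I/O happens outside the buffer lock. */
  public void flush() throws IOException {
    StringBuilder toFlush;
    synchronized (bufferLock) {
      toFlush = buffer;                 // detach the filled buffer
      buffer = new StringBuilder();     // writers keep appending to a fresh one
    }
    sink.write(toFlush.toString().getBytes());  // long-running I/O, lock not held
    sink.flush();
  }
}
```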
