Re: Retire 0.20-append branch?

2011-10-03 Thread Dhruba Borthakur
+1. -dhruba On Mon, Oct 3, 2011 at 2:50 PM, Todd Lipcon wrote: > Hi all, > > Now that the patches from 0.20-append have been merged into 0.20.205, > I think we should consider officially retiring this branch. > > In essence, that means that we would: > 1) Ask contributors to provide patches aga

Re: Merging some trunk changes to 23

2012-01-05 Thread Dhruba Borthakur
+1 for 1 adding "BR scalability (HDFS-395, HDFS-2477, HDFS-2495, HDFS-2476)" These are very significant performance improvements that would be very valuable in 0.23. -dhruba On Thu, Jan 5, 2012 at 7:52 PM, Eli Collins wrote: > Hey gang, > > I was looking at the difference between hdfs trunk a

Re: [DISCUSS] Remove append?

2012-03-22 Thread Dhruba Borthakur
I think "append" would be useful. But not precisely sure which applications would use it. I would vote to keep the code though and not remove it. -dhruba On Thu, Mar 22, 2012 at 5:49 PM, Eli Collins wrote: > On Thu, Mar 22, 2012 at 5:03 PM, Tsz Wo Sze wrote: > > @Eli, Removing a feature would

Re: Hudson build is back to normal: Hadoop-Hdfs-trunk #8

2009-06-30 Thread Dhruba Borthakur
Yes, Good stuff! thanks, dhruba On Tue, Jun 30, 2009 at 12:07 PM, Nigel Daley wrote: > Agreed, this is a very good thing. > > It's all our responsibility to watch it and keep it working :-) > > Nige > > On Jun 30, 2009, at 8:37 AM, Tsz Wo (Nicholas), Sze wrote: > > >> This is the first success

Re: hdfs build fails with latest trunks hadoop-core.jar

2009-07-16 Thread Dhruba Borthakur
Looking at it Dhruba On 7/16/09, Giridharan Kesavan wrote: > Hdfs build fails as we compile it with the latest hadoop-core-0.21.0-dev.jar > build from latest common/trunk > > compile-hdfs-classes: > [javac] Compiling 151 source files to > /home/gkesavan/hdfs-trunk/build/classes > [ja

Re: hdfs build fails with latest trunks hadoop-core.jar

2009-07-16 Thread Dhruba Borthakur
The hadoop-core-0.21-dev.jar is checked into the lib directory in hdfs trunk. This was a change I made to it last Sunday. svn log hadoop-core-0.21.0-dev.jar r793365 | dhruba | 2009-07-12 08:34:17 -0700 (Sun, 12 Jul 2009) | 3

Re: socket buffer sizes hardcoded

2009-09-04 Thread Dhruba Borthakur
Making it configurable seems like a good thing. There is a JIRA (owned by Sanjay) that describes that some of these configuration variables on the client side might become "undocumented"; this means that they might change semantics from one release to another. thanks, dhruba On Wed, Sep 2, 2009 a
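
A minimal sketch of the idea, assuming a hypothetical configuration key (the property name below is made up for illustration): the buffer sizes come from the Hadoop Configuration and are applied with the standard java.net.Socket setters instead of hardcoded constants.

import java.net.Socket;
import org.apache.hadoop.conf.Configuration;

public class SocketBufferExample {
  // Hypothetical property name, used only for this sketch.
  private static final String BUF_KEY = "dfs.example.socket.buffer.size";

  public static Socket newDataSocket(Configuration conf) throws Exception {
    int bufSize = conf.getInt(BUF_KEY, 128 * 1024); // fall back to 128 KB
    Socket s = new Socket();
    s.setSendBufferSize(bufSize);    // instead of a hardcoded constant
    s.setReceiveBufferSize(bufSize);
    return s;
  }
}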

Re: Tracking Replication errors

2009-09-09 Thread Dhruba Borthakur
when a block is being received by a datanode (either because of a replication request or from a client write), the datanode verifies crc. Also, there is a thread in the datanode that periodically verifies crc of existing blocks. dhruba On Wed, Sep 9, 2009 at 7:27 PM, Brian Bockelman wrote:
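
A hedged illustration of the kind of check involved (not the actual DataNode code): recompute a CRC32 over a data chunk and compare it with the stored checksum.

import java.util.zip.CRC32;

public class CrcCheckExample {
  // Returns true if the recomputed CRC of the chunk matches the stored value.
  static boolean verifyChunk(byte[] chunk, long storedCrc) {
    CRC32 crc = new CRC32();
    crc.update(chunk, 0, chunk.length);
    return crc.getValue() == storedCrc;
  }

  public static void main(String[] args) {
    byte[] data = "example block chunk".getBytes();
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    System.out.println(verifyChunk(data, crc.getValue())); // prints true
  }
}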

Re: Tracking Replication errors

2009-09-09 Thread Dhruba Borthakur
bytes of data... this is too much of metadata for the NN to hold in memory. dhruba On Wed, Sep 9, 2009 at 8:33 PM, Brian Bockelman wrote: > > On Sep 9, 2009, at 10:25 PM, Dhruba Borthakur wrote: > > when a block is being received by a datanode (either because of a >> repl

Re: [VOTE] port HDFS-127 (DFSClient block read failures cause open DFSInputStream to become unusable) to hadoop 0.20/0.21

2009-10-19 Thread Dhruba Borthakur
+1. On Mon, Oct 19, 2009 at 3:55 PM, Chris Douglas wrote: > +1 This fix has been in limbo for a long time; thanks for finishing > it, Nicholas. -C > > On Mon, Oct 19, 2009 at 2:34 PM, Tsz Wo (Nicholas), Sze > wrote: > > DFSClient has a retry mechanism on block acquiring for read. If the > numbe

Re: Commit hdfs-630 to 0.21?

2009-12-14 Thread Dhruba Borthakur
If HDFS-630 is a blocker for hbase on small clusters, maybe we can target it for 0.21. Maybe you can run a VOTE for it? thanks, dhruba On Sat, Dec 12, 2009 at 3:54 PM, stack wrote: > HDFS-630 is kinda critical to us over in hbase. We'd like to get it into > 0.21 (Its been committed to TRUNK).

Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?

2010-01-22 Thread Dhruba Borthakur
> tweaked Hadoop to allow the datanodes to get the entire list are you referring to datanodes or dfs clients here? The client already gets the entire list of replica locations for a block from the namenode. and one could always develop a DFS client that is free to choose whatever locations it dec
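
For reference, the public API already exposes those locations; a small example using the standard FileSystem.getFileBlockLocations call to list the hosts holding each block of a file.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ListBlockLocations {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    FileStatus st = fs.getFileStatus(new Path(args[0]));
    // One BlockLocation per block, each listing the datanodes holding a replica.
    for (BlockLocation loc : fs.getFileBlockLocations(st, 0, st.getLen())) {
      System.out.println(loc.getOffset() + " -> " + String.join(",", loc.getHosts()));
    }
  }
}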

Re: [VOTE -- Round 2] Commit hdfs-630 to 0.21?

2010-01-22 Thread Dhruba Borthakur
+1 for making this patch go into 0.21. thanks, dhruba On Fri, Jan 22, 2010 at 10:25 AM, Todd Lipcon wrote: > Hi Steve, > > All of the below may be good ideas, but I don't think they're relevant to > the discussion at hand. Specifically, none of them can enter 0.21 without a > vote as they'd be

Re: [VOTE] Commit hdfs-1024 to 0.20 branch

2010-04-03 Thread Dhruba Borthakur
+1 On 4/2/10, Stack wrote: > Please vote on committing HDFS-1024 to the hadoop 0.20 branch. > > Background: > > HDFS-1024 fixes possible trashing of fsimage because of failed copy > from 2NN and NN. Ordinarily, possible corruption of this proportion > would merit commit w/o need of a vote only Dhr

Re: HDFS VFS Driver

2010-06-16 Thread Dhruba Borthakur
hi mike, it will be nice to get a high level doc on what/how it is implemented. also, you might want to compare it with fuse-dfs http://wiki.apache.org/hadoop/MountableHDFS thanks, dhruba On Wed, Jun 16, 2010 at 8:55 AM, Michael D'Amour wrote: > We have an open source ETL tool (Kettle) which

Re: [VOTE] New branch for HDFS-1052 development

2010-08-19 Thread Dhruba Borthakur
+1 Sent from my iPhone On Aug 19, 2010, at 5:07 PM, Suresh Srinivas wrote: I am planning to create a new branch from the trunk for the work related to HDFS-1052 - HDFS scalability using multiple namenodes/ namespaces. Please see the jira for more details. Doing the development in a sep

Re: is dfsclient caches the data block to local disk before writing?

2010-09-26 Thread Dhruba Borthakur
DFS client does not write the data to local disk first. Instead, it streams data directly to the datanodes in the write pipeline. I will update the document. On Sun, Sep 26, 2010 at 5:21 AM, Gokulakannan M wrote: > Hi, > > > > Is staging still used in hdfs when writing the data? Thi
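
A minimal write example using the standard FileSystem API; the bytes passed to write calls are buffered into packets and streamed down the datanode pipeline rather than spooled to a local file first.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StreamingWriteExample {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    try (FSDataOutputStream out = fs.create(new Path("/tmp/streamed.txt"))) {
      out.writeBytes("data goes straight into the datanode write pipeline\n");
    } // close() flushes the last packet and completes the file at the namenode
  }
}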

Re: Reason to store 64 block file in a sub directory?

2010-10-11 Thread Dhruba Borthakur
The number is just an ad hoc number. The policy is not to put too many block files in the same directory because some local filesystems behave badly if the number of files in the same directory exceeds a certain value. -dhruba On Mon, Oct 11, 2010 at 1:15 PM, Thanh Do wrote: > Hi all, > > can an
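
A hypothetical sketch of the idea (not HDFS's actual layout code): spill block files into numbered subdirectories once a directory holds more than a fixed number of files, so no single directory grows unboundedly.

import java.io.File;

public class BlockDirExample {
  // Illustrative cap only; the number mentioned above is 64.
  static final int MAX_BLOCKS_PER_DIR = 64;

  // Return a directory with room for one more block file, descending into a
  // numbered subdirectory when the current one is full.
  static File dirForNewBlock(File current, long blockId) {
    String[] names = current.list();
    if (names == null || names.length < MAX_BLOCKS_PER_DIR) {
      return current;
    }
    File sub = new File(current, "subdir" + (blockId % MAX_BLOCKS_PER_DIR));
    sub.mkdirs();
    return dirForNewBlock(sub, blockId / MAX_BLOCKS_PER_DIR);
  }
}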

Review Request: Test review

2010-10-26 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13/ --- Review request for hadoop-hdfs and Dmytro Molkov. Summary --- Test Review req

Re: Review Request: Test review

2010-10-26 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/13/ --- (Updated 2010-10-26 17:27:35.304640) Review request for hadoop-hdfs and Dmytro Mol

Re: Review Request: Add listCorruptFileBlocks to DistributedFileSystem (and ClientProtocol)

2010-11-01 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/18/#review21 --- http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/had

Re: Review Request: Populate needed replication queues before leaving safe mode.

2010-11-16 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/105/#review39 --- http://svn.apache.org/repos/asf/hadoop/hdfs/trunk/src/java/org/apache/ha

Review Request: Ability to do savenamespace without being in safemode

2010-11-30 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- Review request for hadoop-hdfs. Summary --- The namenode need not be in safe

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-01 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-01 23:05:00.973158) Review request for hadoop-hdfs. Changes --

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-02 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-02 11:21:23.394822) Review request for hadoop-hdfs. Changes --

Re: Review Request: Ability to do savenamespace without being in safemode

2010-12-07 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/125/ --- (Updated 2010-12-07 11:01:28.866525) Review request for hadoop-hdfs. Changes --

Re: Good VLDB paper on WALs

2010-12-27 Thread Dhruba Borthakur
Hi Todd, Good paper, it would be nice to get the Flush-Pipelining technique (described in the paper) implemented in HBase and HDFS write-ahead logs. (I am CC-ing this to hdfs-...@hadoop as well) HDFS currently uses Hadoop RPC and the server thread blocks till the WAL is written to disk. In earlie

Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- Review request for hadoop-hdfs. Summary --- Make Datanode handle errors to n

Review Request: Make exiting safemode a lot faster.

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/196/ --- Review request for hadoop-hdfs. Summary --- Make exiting safemode a lot fast

Re: Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- (Updated 2010-12-28 13:55:18.599061) Review request for hadoop-hdfs. Changes --

Re: Review Request: Make Datanode handle errors to namenode.register call more elegantly

2010-12-28 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/195/ --- (Updated 2010-12-28 14:09:46.364223) Review request for hadoop-hdfs. Summary --

Re: Review Request: Make exiting safemode a lot faster.

2011-01-23 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/196/ --- (Updated 2011-01-23 20:24:11.634622) Review request for hadoop-hdfs. Changes --

Review Request: Allow a datanode to copy a block to a datanode on a foreign HDFS cluster

2011-01-24 Thread Dhruba Borthakur
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/346/ --- Review request for hadoop-hdfs. Summary --- This patch introduces an RPC to

Re: Use VIP for DataNodes

2011-01-25 Thread Dhruba Borthakur
Fb does not use the VIP approach, we tried that but quickly found out some limitations, one main problem being that the failover server pair has to be in the same subnet (for VIP to work). Instead we now use the AvatarNode integrated with Zookeeper. -dhruba On Tue, Jan 25, 2011 at 6:12 PM, Harsh

Re: Use VIP for DataNodes

2011-01-29 Thread Dhruba Borthakur
> Thanks for sharing. Seems we only need to let NameNode & Backup Node in the > same subnet. Why it is a main problem? > > regards > macf > > On Wed, Jan 26, 2011 at 4:14 PM, Harsh J wrote: > > > Thank you for correcting that :) > > > > On Wed, J

Re: Merging Namenode Federation feature (HDFS-1052) to trunk

2011-03-14 Thread Dhruba Borthakur
Hi folks, The design for the federation work has been published and there is a very well-written design document. It explains the pros and cons of each design point. It would be nice if more people could review this document and provide comments on how to make it better. The implementation is in p

Re: Branch for HDFS-1073 and related work

2011-03-28 Thread Dhruba Borthakur
+1. I think this will be very helpful in moving the design forward quickly. -dhruba On Mon, Mar 28, 2011 at 1:14 PM, Todd Lipcon wrote: > Hi all, > > I discussed this with a couple folks over the weekend who are involved in > the project, but wanted to let the dev community at large know: > >

Re: VOTE: Committing HADOOP-6949 to 0.22 branch

2011-03-28 Thread Dhruba Borthakur
This is a very effective optimization, +1 on pulling it to 0.22. -dhruba On Mon, Mar 28, 2011 at 9:39 PM, Konstantin Shvachko wrote: > HADOOP-6949 introduced a very important optimization to the RPC layer. > Based > on the benchmarks presented in HDFS-1583 this provides an order of > magnitude

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

2011-04-23 Thread Dhruba Borthakur
Given that we will be re-organizing the svn tree very soon and the fact that the design and most of the implementation is complete, let's merge it into trunk! -dhruba On Fri, Apr 22, 2011 at 9:48 AM, Suresh Srinivas wrote: > A few weeks ago, I had sent an email about the progress of HDFS federat

Re: [Discuss] Merge federation branch HDFS-1052 into trunk

2011-04-26 Thread Dhruba Borthakur
I feel that making the datanode talk to multiple namenodes is very valuable, especially when there is plenty of storage available on a single datanode machine (think 24 TB to 36 TB) and a single namenode does not have enough memory to hold all file metadata for such a large cluster in memory. This

Re: How to build facebook-hadoop?

2011-06-14 Thread Dhruba Borthakur
Hi Matt, This part of the code is not part of any Apache release, so you won't be able to get any support from this list. I would suggest that your best bet would be to participate in the github based mailing lists itself. Also, that code release is not a "distribution", so I won't be surprised if

[jira] Resolved: (HDFS-454) HDFS workflow in JIRA does not match MAPREDUCE, HADOOP

2009-07-01 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-454. --- Resolution: Fixed Fix Version/s: 0.21.0 Thanks to Owen for fixing this. > H

[jira] Created: (HDFS-487) HDFS should expose a fileid to uniquely identify a file

2009-07-10 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur HDFS should expose an id that uniquely identifies a file. This helps in developing applications that work correctly even when files are moved from one directory to another. A typical use-case is to make the Pluggable Block Placement

[jira] Created: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-07-24 Thread dhruba borthakur (JIRA)
Implement erasure coding as a layer on HDFS --- Key: HDFS-503 URL: https://issues.apache.org/jira/browse/HDFS-503 Project: Hadoop HDFS Issue Type: New Feature Reporter: dhruba borthakur

[jira] Created: (HDFS-532) Allow applications to know that a read failed because block is missing

2009-08-06 Thread dhruba borthakur (JIRA)
: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur I have an application that has intelligence to retrieve data from alternate locations if HDFS cannot provide this data. This can happen when data in HDFS is corrupted or the block is missing. HDFS already

[jira] Created: (HDFS-575) DFSClient read performance can be improved by stagerring connection setup to datanode(s)

2009-08-27 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: dhruba borthakur The DFS client opens a socket connection to a DN for the n-th block, fetches the n-th block from that datanode and then opens socket connections to the datanode that contains

[jira] Created: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests

2009-09-05 Thread dhruba borthakur (JIRA)
: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes. Sometime
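
A hedged sketch of one way to express the idea: drain a priority queue so heartbeat/block-report requests are dequeued ahead of ordinary client calls. This is an illustration only, not the actual NameNode RPC code.

import java.util.concurrent.PriorityBlockingQueue;

public class PrioritizedCalls {
  enum Kind { DATANODE_HEARTBEAT, CLIENT_REQUEST }

  static class Call implements Comparable<Call> {
    final Kind kind;
    final Runnable work;
    Call(Kind kind, Runnable work) { this.kind = kind; this.work = work; }
    // Heartbeats sort ahead of client requests.
    public int compareTo(Call other) { return kind.compareTo(other.kind); }
  }

  static final PriorityBlockingQueue<Call> queue = new PriorityBlockingQueue<>();

  // Handler loop: always serves pending heartbeats before client requests.
  static void handlerLoop() throws InterruptedException {
    while (true) {
      queue.take().work.run();
    }
  }
}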

[jira] Created: (HDFS-600) Support for pluggable erasure coding policy for HDFS

2009-09-05 Thread dhruba borthakur (JIRA)
: dhruba borthakur HDFS-503 introduces erasure coding for HDFS files. It currently uses the "xor" algorithm as the Erasure coding algorithm. It would be nice if that Erasure Coding framework supports a pluggable API to allow plugging in other Erasure Coding policies. A few of these po

[jira] Created: (HDFS-611) Heartbeats times from Datanodes increase when there are plenty of blocks to delete

2009-09-10 Thread dhruba borthakur (JIRA)
HDFS Issue Type: Bug Components: data-node Reporter: dhruba borthakur Assignee: dhruba borthakur I am seeing that when we delete a large directory that has plenty of blocks, the heartbeat times from datanodes increase significantly from the normal value

[jira] Created: (HDFS-684) Use HAR filesystem to merge parity files

2009-10-07 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur The HDFS raid implementation (HDFS-503) creates a parity file for every file that is RAIDed. This puts additional burden on the memory requirements of the namenode. It will be nice if the parity files are combined together using the HadoopArchive (har) format

[jira] Created: (HDFS-695) RaidNode should read in configuration from hdfs-site.xml

2009-10-12 Thread dhruba borthakur (JIRA)
: contrib/raid Reporter: dhruba borthakur Assignee: dhruba borthakur The RaidNode currently reads in the configuration from hadoop-*.xml. It should read in its config parameters from hdfs-site.xml as well. -- This message is automatically generated by JIRA. - You can reply

[jira] Created: (HDFS-729) fsck option to list only corrupted files

2009-10-23 Thread dhruba borthakur (JIRA)
fsck option to list only corrupted files Key: HDFS-729 URL: https://issues.apache.org/jira/browse/HDFS-729 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur

[jira] Created: (HDFS-743) file size is fluctuating although file is closed

2009-10-29 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur Priority: Blocker I am seeing that the length of a file sometimes becomes zero after a namenode restart. These files have only one block. All the three replicas of that block on the datanode(s) have non-zero size

[jira] Created: (HDFS-751) TestCrcCorruption succeeds but is not testing anything of value

2009-11-03 Thread dhruba borthakur (JIRA)
Affects Versions: 0.21.0 Reporter: dhruba borthakur Assignee: dhruba borthakur The test is broken since the DataNode introduces rbr, rwr, FINALISED status of blocks. It tries to corrupt blocks in dfs/data/data1/current, does not find any blocks there, and continues happily

[jira] Created: (HDFS-756) libhdfs unit tests do not run

2009-11-08 Thread dhruba borthakur (JIRA)
libhdfs unit tests do not run -- Key: HDFS-756 URL: https://issues.apache.org/jira/browse/HDFS-756 Project: Hadoop HDFS Issue Type: Bug Components: contrib/libhdfs Reporter: dhruba

[jira] Created: (HDFS-757) Unit tests failure for RAID

2009-11-09 Thread dhruba borthakur (JIRA)
Unit tests failure for RAID --- Key: HDFS-757 URL: https://issues.apache.org/jira/browse/HDFS-757 Project: Hadoop HDFS Issue Type: Bug Components: contrib/raid Reporter: dhruba borthakur

[jira] Resolved: (HDFS-503) Implement erasure coding as a layer on HDFS

2009-11-09 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-503. --- Resolution: Fixed I will fix the unit-test failure via HDFS-757. The unit tests failed when

[jira] Created: (HDFS-763) DataBlockScanner reporting of bad blocks is slightly misleading

2009-11-10 Thread dhruba borthakur (JIRA)
Components: data-node Affects Versions: 0.20.1 Reporter: dhruba borthakur Assignee: dhruba borthakur The Datanode generates a report of the periodic block scanning that verifies crcs. It reports something like the following: Scans since restart : 192266 Scan errors

[jira] Resolved: (HDFS-769) test-c++-libhdfs constantly fails

2009-11-13 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-769. --- Resolution: Duplicate duplicate of HDFS-756 > test-c++-libhdfs constantly fa

[jira] Created: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline

2009-12-10 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Improvement Components: hdfs client Reporter: dhruba borthakur Assignee: dhruba borthakur HDFS does not replicate the last block of the file that is being currently written to by an application. Every datanode

[jira] Created: (HDFS-839) The NameNode should forward block reports to BackupNode

2009-12-16 Thread dhruba borthakur (JIRA)
Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The BackupNode (via HADOOP-4539) receives a stream of transactions from NameNode. However, the BackupNode does not have block locations of blocks. It would be nice if the NameNode can forward all block

[jira] Created: (HDFS-841) error while checkpointing edits log gets ignored

2009-12-17 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur FSImage.rollEditLog is called to create a new checkpoint of the edits log. In turn, it calls incrementCheckpointTime, which in turn invokes setCheckpointTime. This method catches and ignores the exception thrown by writeCheckpointTime(). I am assuming that this

[jira] Created: (HDFS-853) The HDFS webUI should show a metric that summarizes whether the cluster is balanced regarding disk space usage

2009-12-24 Thread dhruba borthakur (JIRA)
://issues.apache.org/jira/browse/HDFS-853 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur It is desirable to know how much the datanodes vary from one another in terms of space utilization to get a sense of how well an HDFS

[jira] Created: (HDFS-854) Datanode should scan devices in parallel to generate block report

2009-12-26 Thread dhruba borthakur (JIRA)
: Improvement Components: data-node Reporter: dhruba borthakur A Datanode should scan its disk devices in parallel so that the time to generate a block report is reduced. This will reduce the startup time of a cluster. A datanode has 12 disks (each of 1 TB) to store HDFS blocks. There
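
A rough sketch, using a plain ExecutorService, of scanning each data directory on its own thread and merging the per-disk lists; the names here are illustrative, not the DataNode's actual classes.

import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

public class ParallelBlockScan {
  // Scan every data directory concurrently and merge the block file lists.
  static List<File> scanAll(List<File> dataDirs) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(dataDirs.size());
    List<Future<List<File>>> futures = new ArrayList<>();
    for (File dir : dataDirs) {
      futures.add(pool.submit(() -> scanOneDisk(dir)));
    }
    List<File> blocks = new ArrayList<>();
    for (Future<List<File>> f : futures) {
      blocks.addAll(f.get());
    }
    pool.shutdown();
    return blocks;
  }

  // Recursively collect block files ("blk_*") under one disk's directory tree.
  static List<File> scanOneDisk(File dir) {
    List<File> found = new ArrayList<>();
    File[] entries = dir.listFiles();
    if (entries == null) return found;
    for (File e : entries) {
      if (e.isDirectory()) found.addAll(scanOneDisk(e));
      else if (e.getName().startsWith("blk_")) found.add(e);
    }
    return found;
  }
}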

[jira] Created: (HDFS-855) namenode can save images in parallel to all directories in fs.name.dir

2009-12-27 Thread dhruba borthakur (JIRA)
: Improvement Components: name-node Reporter: dhruba borthakur The namenode restart times can be reduced if the namenode can save its image to multiple directories (specified in fs.name.dir) in parallel. The NN has a 6 GB fsimage and 1 MB edits file. The NN needed 10 minutes

[jira] Created: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file

2010-01-13 Thread dhruba borthakur (JIRA)
: Improvement Components: hdfs client Reporter: dhruba borthakur In the current trunk, the HDFS client methods writeChunk() and hflush/sync are synchronized. This means that if a hflush/sync is in progress, an application cannot write data to the HDFS client buffer. This reduces
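
A simplified, hypothetical illustration of the proposal: let new writes keep filling an in-memory buffer under a short lock while the flush ships a snapshot of the previously buffered bytes, instead of holding one monitor across both the write and the flush.

import java.io.ByteArrayOutputStream;

public class DoubleBufferedWriter {
  private final Object lock = new Object();
  private ByteArrayOutputStream current = new ByteArrayOutputStream();

  // Writers only need the lock long enough to append to the in-memory buffer.
  void write(byte[] data) {
    synchronized (lock) {
      current.write(data, 0, data.length);
    }
  }

  // hflush swaps buffers under the lock, then ships the snapshot without
  // blocking new write() calls.
  void hflush() {
    byte[] snapshot;
    synchronized (lock) {
      snapshot = current.toByteArray();
      current = new ByteArrayOutputStream();
    }
    sendToPipeline(snapshot); // long-running I/O happens outside the lock
  }

  private void sendToPipeline(byte[] bytes) {
    // placeholder for the actual datanode pipeline transfer
  }
}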

[jira] Created: (HDFS-947) The namenode should redirect a hftp request to read a file to the datanode that has the maximum number of local replicas

2010-02-03 Thread dhruba borthakur (JIRA)
: https://issues.apache.org/jira/browse/HDFS-947 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur A client that uses the Hftp protocol to read a file is redirected by the namenode to a random datanode

[jira] Created: (HDFS-966) NameNode recovers lease even in safemode

2010-02-09 Thread dhruba borthakur (JIRA)
: dhruba borthakur The NameNode recovers a lease even when it is in safemode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.

[jira] Created: (HDFS-976) Hot Standby for NameNode

2010-02-13 Thread dhruba borthakur (JIRA)
Hot Standby for NameNode Key: HDFS-976 URL: https://issues.apache.org/jira/browse/HDFS-976 Project: Hadoop HDFS Issue Type: New Feature Components: name-node Reporter: dhruba borthakur

[jira] Created: (HDFS-978) Record every new block allocation of a file into the transaction log.

2010-02-14 Thread dhruba borthakur (JIRA)
: Improvement Components: name-node Reporter: dhruba borthakur HDFS should record every new block allocation (of a file) into its transaction logs. In the current code, block allocations are persisted only when a file is closed or hflush-ed. This feature will enable HDFS

[jira] Created: (HDFS-983) NameNode transaction log corruption with bad filename

2010-02-17 Thread dhruba borthakur (JIRA)
-node Affects Versions: 0.20.1 Reporter: dhruba borthakur The SecondaryNamenode is unable to create checkpoints. The stack trace is attached. This is the second time we have seen this issue. Both these occurrences happened with unprintable characters in the filename. I am wondering

[jira] Created: (HDFS-988) saveNamespace can corrupt edits log

2010-02-18 Thread dhruba borthakur (JIRA)
saveNamespace can corrupt edits log --- Key: HDFS-988 URL: https://issues.apache.org/jira/browse/HDFS-988 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: dhruba

[jira] Resolved: (HDFS-87) NameNode startup fails if edit log terminates prematurely

2010-02-19 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-87?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-87. -- Resolution: Not A Problem > NameNode startup fails if edit log terminates prematur

[jira] Created: (HDFS-1024) SecondaryNamenode fails to checkpoint because namenode fails with CancelledKeyException

2010-03-05 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.2 Reporter: dhruba borthakur Assignee: Dmytro Molkov The secondary namenode fails to retrieve the entire fsimage from the Namenode. It fetches a part of the fsimage but believes that it has fetched the

[jira] Created: (HDFS-1031) Enhance the webUi to list a few of the corrupted files in HDFS

2010-03-08 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur The existing webUI displays something like this: WARNING : There are about 12 missing blocks. Please check the log or run fsck. It would be nice if we can display the filenames that have missing blocks. -- This message is automatically generated by JIRA

[jira] Created: (HDFS-1034) Enhance datanode to read data and checksum file in parallel

2010-03-11 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur In the current HDFS implementation, a read of a block issued to the datanode results in a disk access to the checksum file followed by a disk access to the data file. It would be nice to be able to do these two IOs in

[jira] Created: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

2010-03-29 Thread dhruba borthakur (JIRA)
Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov If you have a large number of files in HDFS, the fsimage file is very big. When the namenode restarts, it writes a copy of the fsimage to all directories

[jira] Created: (HDFS-1093) Improve namenode scalability by splitting the FSNamesystem synchronized section in a read/write lock

2010-04-11 Thread dhruba borthakur (JIRA)
/HDFS-1093 Project: Hadoop HDFS Issue Type: Improvement Reporter: dhruba borthakur Assignee: dhruba borthakur Most critical data structures in the NameNode (NN) are protected by synchronized methods in the FSNamesystem class. This essentially makes
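
The standard java.util.concurrent pattern the JIRA points toward: read-only operations share a read lock while mutations take the exclusive write lock. This is a generic sketch, not the FSNamesystem code itself.

import java.util.concurrent.locks.ReentrantReadWriteLock;

public class ReadWriteLockedNamespace {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

  // Many readers (e.g. lookup-style operations) can proceed together.
  public String lookup(String path) {
    lock.readLock().lock();
    try {
      return doLookup(path);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Mutations (e.g. create/delete) still serialize behind the write lock.
  public void mutate(String path) {
    lock.writeLock().lock();
    try {
      doMutate(path);
    } finally {
      lock.writeLock().unlock();
    }
  }

  private String doLookup(String path) { return path; }
  private void doMutate(String path) { }
}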

[jira] Created: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss

2010-04-12 Thread dhruba borthakur (JIRA)
Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The current HDFS implementation specifies that the first replica is local and the other two replicas are on any two random nodes on a random remote rack. This means

[jira] Resolved: (HDFS-980) Convert FSNamesystem lock to ReadWriteLock

2010-04-13 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-980. --- Resolution: Duplicate Duplicate of HDFS-1093 > Convert FSNamesystem lock to ReadWriteL

[jira] Resolved: (HDFS-983) NameNode transaction log corruption with bad filename

2010-04-15 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-983. --- Resolution: Duplicate I have not seen this problem after we imported the patch from HADOOP

[jira] Created: (HDFS-1108) ability to create a file whose newly allocated blocks are automatically persisted immediately

2010-04-23 Thread dhruba borthakur (JIRA)
Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur The current HDFS design says that newly allocated blocks for a file are not persisted in the NN transaction log when the

[jira] Created: (HDFS-1129) Allow HDFS DataNodes and NameNode to have different build versions

2010-05-04 Thread dhruba borthakur (JIRA)
: Improvement Reporter: dhruba borthakur There are times when we want to deploy fixes to the NameNode without restarting the DataNodes. This reduces the restart time of the entire cluster. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to

[jira] Created: (HDFS-1145) When NameNode is shutdown it tries to exit safemode

2010-05-11 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Suppose the NameNode is in safemode. Then we try to shut it down by invoking NameNode.stop(). The stop() method interrupts all waiting threads, which, in turn, causes the SafeMode monitor to exit and thus triggers replication/deletion of blocks

[jira] Created: (HDFS-1147) Reduce NN startup time by reducing the processing time of block reports

2010-05-11 Thread dhruba borthakur (JIRA)
Type: Improvement Components: name-node Reporter: dhruba borthakur The NameNode restart times are impacted to a large extent by the processing time of block reports. For a cluster with 150 million blocks, the block report processing in the NN can take up to 20 minutes or so

[jira] Created: (HDFS-1162) Reduce the time required for a checkpoint

2010-05-18 Thread dhruba borthakur (JIRA)
Reporter: dhruba borthakur Assignee: dhruba borthakur The checkpoint time increases linearly with the number of files in the cluster. This is a problem with large clusters. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the

[jira] Created: (HDFS-1179) Implement a file change log

2010-05-27 Thread dhruba borthakur (JIRA)
Implement a file change log --- Key: HDFS-1179 URL: https://issues.apache.org/jira/browse/HDFS-1179 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur

[jira] Reopened: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers

2010-06-02 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur reopened HDFS-200: --- This has to be pulled into the branch-0.20-append branch. > In HDFS, sync() not yet guarant

[jira] Resolved: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers

2010-06-04 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-200. --- Resolution: Fixed I committed this into 0.20-append branch. > In HDFS, sync() not

[jira] Created: (HDFS-1200) The namenode could remember the last good location of a missing block

2010-06-10 Thread dhruba borthakur (JIRA)
: Improvement Components: name-node Reporter: dhruba borthakur There are times when datanodes die and all replicas of a block are lost. An fsck on the HDFS reports these as "MISSING" blocks in the filesystem. The administrator has to go mine through lots of nameno

[jira] Resolved: (HDFS-1054) Remove unnecessary sleep after failure in nextBlockOutputStream

2010-06-11 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1054. Fix Version/s: 0.21.0 (was: 0.20-append) Resolution: Fixed I

[jira] Resolved: (HDFS-1216) Update to JUnit 4 in branch 20 append

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1216. Resolution: Fixed I just committed this. Thanks Todd! > Update to JUnit 4 in branch

[jira] Resolved: (HDFS-142) In 0.20, move blocks being written into a blocksBeingWritten directory

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-142. --- Resolution: Fixed I have committed this. Thanks Sam, Nicolas and Todd. > In 0.20, move blo

[jira] Resolved: (HDFS-1141) completeFile does not check lease ownership

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1141. Resolution: Fixed Pulled into hadoop-0.20-append > completeFile does not check le

[jira] Resolved: (HDFS-1207) 0.20-append: stallReplicationWork should be volatile

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1207. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd

[jira] Resolved: (HDFS-1204) 0.20: Lease expiration should recover single files, not entire lease holder

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1204. Resolution: Fixed > 0.20: Lease expiration should recover single files, not entire le

[jira] Resolved: (HDFS-1210) DFSClient should log exception when block recovery fails

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1210. Fix Version/s: 0.20-append Resolution: Fixed I just committed this. Thanks Todd

[jira] Resolved: (HDFS-1211) 0.20 append: Block receiver should not log "rewind" packets at INFO level

2010-06-16 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1211. Resolution: Fixed I just committed this. Thanks Todd! > 0.20 append: Block receiver sho

[jira] Created: (HDFS-1254) 0.20: mark dfs.support.append to be true by default for the 0.20-append branch

2010-06-21 Thread dhruba borthakur (JIRA)
Issue Type: Bug Components: name-node Affects Versions: 0.20-append Reporter: dhruba borthakur Assignee: dhruba borthakur Fix For: 0.20-append The 0.20-append branch supports append/sync for HDFS. Change the default configuration to enable
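
For reference, the property in question is dfs.support.append; in code it can be toggled on a Configuration as below (the equivalent hdfs-site.xml entry sets the same key). On the 0.20-append branch this becomes the default; elsewhere it must be set explicitly.

import org.apache.hadoop.conf.Configuration;

public class AppendConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Enable append/sync support for HDFS clients and the namenode.
    conf.setBoolean("dfs.support.append", true);
    System.out.println("append enabled: " + conf.getBoolean("dfs.support.append", false));
  }
}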

[jira] Resolved: (HDFS-1254) 0.20: mark dfs.support.append to be true by default for the 0.20-append branch

2010-06-22 Thread dhruba borthakur (JIRA)
[ https://issues.apache.org/jira/browse/HDFS-1254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur resolved HDFS-1254. Hadoop Flags: [Reviewed] Resolution: Fixed I just committed this. > 0.20: m
