Rise of the Dragon

2016-03-19 Thread Colin P. McCabe
Who is creating all these dragon JIRAs? Colin

Re: CHANGES.txt is gone from trunk, branch-2, branch-2.8

2016-03-08 Thread Colin P. McCabe
+1 Thanks, Andrew. This will avoid so many spurious conflicts when cherry-picking changes, and so much wasted time on commit. best, Colin On Thu, Mar 3, 2016 at 9:11 PM, Andrew Wang wrote: > Hi all, > > With the inclusion of HADOOP-12651 going back to branch-2.8, CHANGES.txt > and release note

Re: Looking to a Hadoop 3 release

2016-02-22 Thread Colin P. McCabe
e released in 2.9. I think we > should rather concentrate our EC dev efforts to harden key features under > the follow-on umbrella HDFS-8031 and make it solid for a 3.0 release. > > Sincerely, > Zhe > > On Mon, Feb 22, 2016 at 9:25 AM Colin P. McCabe wrote: > >>

Re: Looking to a Hadoop 3 release

2016-02-22 Thread Colin P. McCabe
+1 for a release of 3.0. There are a lot of significant, compatibility-breaking, but necessary changes in this release... we've touched on some of them in this thread. +1 for a parallel release of 2.8 as well. I think we are pretty close to this, barring a dozen or so blockers. best, Colin On

Re: Hadoop encryption module as Apache Chimera incubator project

2016-02-02 Thread Colin P. McCabe
It's great to see interest in improving this functionality. I think Chimera could be successful as an Apache project. I don't have a strong opinion one way or the other as to whether it belongs as part of Hadoop or separate. I do think there will be some challenges splitting this functionality o

Re: Jenkins stability and patching

2015-11-23 Thread Colin P. McCabe
On Mon, Nov 23, 2015 at 1:53 PM, Colin P. McCabe wrote: > I agree that our tests are in a bad state. It would help if we could > maintain a list of "flaky tests" somewhere in git and have Yetus > consider the flakiness of a test before -1ing a patch. Right now, we > pre

Re: Jenkins stability and patching

2015-11-23 Thread Colin P. McCabe
I agree that our tests are in a bad state. It would help if we could maintain a list of "flaky tests" somewhere in git and have Yetus consider the flakiness of a test before -1ing a patch. Right now, we pretty much all have that list in our heads, and we're not applying it very consistently. Hav

Re: Java 8 + Jersey updates

2015-10-26 Thread Colin P. McCabe
Looks like a good idea. I assume you are targeting this only at trunk / 3.0 based on the "target version" and the incompatibility discussion? best, Colin On Mon, Oct 26, 2015 at 7:07 AM, Tsuyoshi Ozawa wrote: > Hi Steve, > > Thanks for your help. > > > 2. it's "significant" > > This change in

Re: hadoop-hdfs-client splitoff is going to break code

2015-10-19 Thread Colin P. McCabe
Thanks for being proactive here, Steve. I think this is a good example of why this change should have been done in a branch rather than having been done directly in trunk. regards, Colin On Wed, Oct 14, 2015 at 10:36 AM, Steve Loughran wrote: > just an FYI, the split off of hadoop hdfs into c

Re: [DISCUSS] Looking to a 2.8.0 release

2015-10-05 Thread Colin P. McCabe
I think it makes sense to have a 2.8 release since there are a tremendous number of JIRAs in 2.8 that are not in 2.7. Doing a 3.x release seems like something we should consider separately since it would not have the same compatibility guarantees as a 2.8 release. There's a pretty big delta betwee

Re: INotify stability

2015-09-22 Thread Colin P. McCabe
Hi Mohammad, Like ATM said, HDFS-8965 is an important fix in this area. We have found that it prevents cases where INotify tries to read invalid sequences of bytes (sometimes because the edit log was truncated or corrupted; other times because it is in the middle of a write). HDFS-8964 fixes the

Re: Even after HDFS-2856 JSVC References are require..?

2015-09-14 Thread Colin P. McCabe
Has anyone measured the overhead of running SASL on DataTransferProtocol? I would expect it to be non-zero compared with simply running on a low port. The CPU overhead especially could regress performance on a typical Hadoop cluster. best, Colin On Thu, Sep 10, 2015 at 9:55 AM, Chris Nauroth w

Re: DISCUSS: is the order in FS.listStatus() required to be sorted?

2015-06-15 Thread Colin P. McCabe
On Mon, Jun 1, 2015 at 3:21 AM, Steve Loughran wrote: > > HADOOP-12009 (https://issues.apache.org/jira/browse/HADOOP-12009) patches the > FS javadoc and contract tests to say "the order you get things back from a > listStatus() isn't guaranteed to be alphanumerically sorted" > > That's one of th

Re: How to Tag a request using optional field

2015-06-15 Thread Colin P. McCabe
On Thu, Jun 4, 2015 at 2:46 PM, Rahul Shrivastava wrote: > Hi, > > > Suppose I write a Java client to create a directory on HDFS. Is there a way > to tag this request and get the tagged information on NameNode via > DFSInotifyEventInputStream or otherwise ? > > In short, is there a way to give opt

Re: Jenkins precommit-*-build

2015-05-05 Thread Colin P. McCabe
Thanks, Allen. This has long been a thorn in our side, and it's really good to see someone work on it. cheers, Colin On Tue, May 5, 2015 at 2:59 PM, Allen Wittenauer wrote: > TL;DR: > > Heads up: I’m going to hack on these scripts to fix the race > conditions. > > > > Pre

Re: [RESULT][VOTE] Release Apache Hadoop 2.7.0 RC0

2015-04-23 Thread Colin P. McCabe
Sorry for the late reply. It seems like the consensus is that we should push these fixes to 2.7.1. That works for me. HADOOP-11802 should be in there soon, hopefully the rest will follow quickly. best, Colin On Wed, Apr 22, 2015 at 4:27 PM, Vinod Kumar Vavilapalli wrote: > It took a while for

Re: Hadoop - Major releases

2015-03-17 Thread Colin P. McCabe
Thanks, Andrew and Joep. +1 for maintaining wire and API compatibility, but moving to JDK8 in 3.0 best, Colin On Mon, Mar 16, 2015 at 3:22 PM, Andrew Wang wrote: > I took the liberty of adding line breaks to Joep's mail. > > Thanks for the great feedback Joep. The goal with 3.x is to maintain A

Re: upstream jenkins build broken?

2015-03-16 Thread Colin P. McCabe
> changes >>>>> > >> in these test suites is HDFS-7722. That patch still looks fine >>>>> > though. I >>>>> > >> don't know if there are other uncommitted patches that changed these >>>>> > test >>>>> > >>

upstream jenkins build broken?

2015-03-10 Thread Colin P. McCabe
Hi all, A very quick (and not thorough) survey shows that I can't find any jenkins jobs that succeeded from the last 24 hours. Most of them seem to be failing with some variant of this message: [ERROR] Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean) o

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-10 Thread Colin P. McCabe
way for the current > plan on hadoop-3.x right? So, I don't see the difference? > > Arun > > > From: Colin P. McCabe > Sent: Monday, March 09, 2015 3:05 PM > To: hdfs-dev@hadoop.apache.org > Cc: mapreduce-...@hadoop.apache

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-10 Thread Colin P. McCabe
Er, that should read "as Allen commented" C. On Tue, Mar 10, 2015 at 11:55 AM, Colin P. McCabe wrote: > Hi Arun, > > Not all changes which are incompatible can be "fixed"-- sometimes an > incompatibility is a necessary part of a change. For example, taking &

Re: Hadoop 3.x: what about shipping trunk as a 2.x release in 2015?

2015-03-09 Thread Colin P. McCabe
Java 7 will be end-of-lifed in April 2015. I think it would be unwise to plan a new Hadoop release against a version of Java that is almost obsolete and (soon) no longer receiving security updates. I think people will be willing to roll out a new version of Java for Hadoop 3.x. Similarly, the wh

Re: DISCUSSION: Patch commit criteria.

2015-03-02 Thread Colin P. McCabe
I agree with Andrew and Konst here. I don't think the language is unclear in the rule, either... "consensus with a minimum of one +1" clearly indicates that _other people_ are involved, not just one person. I would also mention that we created the "branch committer" role specifically to make it e

Re: TimSort bug and its workaround

2015-03-02 Thread Colin P. McCabe
Thanks for bringing this up. If you can find any place where an array might realistically be larger than 67 million elements, then I guess file a JIRA for it. Also this array needs to be of objects, not of primitives (quicksort is used for those in jdk7, apparently). I can't think of any such pl

Re: Erratic Jenkins behavior

2015-02-18 Thread Colin P. McCabe
eed to download dependencies > fresh every time. > > Chris Nauroth > Hortonworks > http://hortonworks.com/ > > > > > > > On 2/12/15, 2:00 PM, "Colin P. McCabe" wrote: > >>We could potentially use different .m2 directories for each executor. >>I t

Re: Theory question: good values for FileStatus.getBlockSize()

2015-02-17 Thread Colin P. McCabe
In the past, "block size" and "size of block N" were completely separate concepts in HDFS. The former was often referred to as "default block size" or "preferred block size" or some such thing. Basically it was the point at which we'd call it a day and move on to the next block, whenever any bloc

Re: Datanode synchronization is horrible. I’m thinking we can use ReentrantReadWriteLock for synchronization. What do you guys think?

2015-02-17 Thread Colin P. McCabe
In general, the DN does not perform reads from files under a big lock. We only need the lock for protecting the replica map and some of the block state. This lock hasn't really been a big problem in the past and I would hesitate to add complexity here (although I haven't thought about it that hard

Re: max concurrent connection to HDFS name node

2015-02-17 Thread Colin P. McCabe
Hi Demai, Nearly all input and output stream operations will talk directly to the DN without involving the NN. The NameNode is involved in metadata operations such as renaming or opening files, not in reading data. Hope this helps. best, Colin On Thu, Feb 12, 2015 at 4:21 PM, Demai Ni wrote:

Re: Erratic Jenkins behavior

2015-02-12 Thread Colin P. McCabe
of our missing class error issues. Colin On Tue, Feb 10, 2015 at 2:13 AM, Steve Loughran wrote: > Mvn is a dark mystery to us all. I wouldn't trust it not to pick up things from > other builds if they ended up published to ~/.m2/repository during the process > > > > On 9 Febr

Re: Erratic Jenkins behavior

2015-02-09 Thread Colin P. McCabe
I'm sorry, I don't have any insight into this. With regard to HADOOP-11084, I thought that $BUILD_URL would be unique for each concurrent build, which would prevent build artifacts from getting mixed up between jobs. Based on the value of PATCHPROCESS that Kihwal posted, perhaps this is not the c

Re: NFSv3 Filesystem Connector

2015-01-14 Thread Colin P. McCabe
Hi Niels, I agree that direct-attached storage seems more economical for many users. As an HDFS developer, I certainly have a dog in this fight as well :) But we should be respectful towards people trying to contribute code to Hadoop and evaluate the code on its own merits. It is up to our users

Re: HDFS 2.6.0 upgrade ends with missing blocks

2015-01-08 Thread Colin P. McCabe
Hi dlmarion, In general, any upgrade process we do will consume disk space, because it's creating hardlinks and a new "current" directory, and so forth. So upgrading when disk space is very low is a bad idea in any scenario. It's certainly a good idea to free up some space before doing the upgrad