I'm +1 for migrating to Java 8 as soon as possible. That means both branch-2 and trunk, as having them on the same language level makes cherry-picking changes from trunk possible. That's particularly the case for Java 8, as it is the first major change to the language since Java 5.
W.r.t. shipping trunk as 3.x, it's going to take longer than planned. Hopefully not as long as the 2.x release process, but you never know. Which means I expect some more Hadoop 2 releases this year. We need to make the jump there too: get 2.7 out the door and include a roadmap in there for when the codebase goes Java 8+ only.

-Steve

ps. for anyone who wants a pure Java 8 build today, pass -Djavac.version=1.8 on the command line of a Maven build. Last time I tried there were some (minor) bits of YARN that wouldn't compile...

On 2 March 2015 at 18:31:00, Arun Murthy (a...@hortonworks.com) wrote:

Andrew,

Thanks for bringing up this discussion.

I'm a little puzzled, for I feel like we are rehashing the same discussion from last year - where we agreed on a different course of action w.r.t. the switch to JDK7.

IAC, breaking compatibility for hadoop-3 is a pretty big cost - particularly for users such as Yahoo/Twitter/eBay who have several clusters between which compatibility is paramount.

Now, breaking compatibility is perfectly fine over time where there is sufficient benefit, e.g. HDFS HA or YARN in hadoop-2 (vs. hadoop-1).

However, I'm struggling to quantify the benefit of hadoop-3 for users against the cost of the breakage. Given that we already agreed to put JDK7 in 2.7, and that the classpath is a fairly minor irritant given some existing solutions (e.g. a new default classloader), how do you quantify the benefit for users?

We could just do JDK8 in hadoop-2.10 or some such; you are definitely welcome to run the RM role for that release.

Furthermore, I'm really concerned that this will be used as an opportunity to further break compat in more egregious ways. Also, are you foreseeing more compat breaks? OTOH, if we all agree that we should absolutely prevent compat breakages such as the client-server wire protocol, I feel the point of a major release is kinda lost.
Overall, my biggest concern is the compatibility story vis-a-vis the benefit.

Thoughts?

thanks,
Arun

________________________________________
From: Andrew Wang <andrew.w...@cloudera.com>
Sent: Monday, March 02, 2015 3:19 PM
To: common-...@hadoop.apache.org; mapreduce-...@hadoop.apache.org; hdfs-dev@hadoop.apache.org; yarn-...@hadoop.apache.org
Subject: Looking to a Hadoop 3 release

Hi devs,

It's been a year and a half since 2.x went GA, and I think we're about due for a 3.x release. Notably, there are two incompatible changes I'd like to call out that will have a tremendous positive impact for our users.

First, classpath isolation being done in HADOOP-11656, which has been a long-standing request from many downstreams and Hadoop users.

Second, bumping the source and target JDK version to JDK8 (related to HADOOP-11090), which is important since JDK7 is EOL in April 2015 (two months from now). In the past, we've had issues with our dependencies discontinuing support for old JDKs, so this will future-proof us.

Between the two, we'll also have quite an opportunity to clean up and upgrade our dependencies, another common user and developer request.

I'd like to propose that we start rolling a series of monthly-ish 3.0 alpha releases ASAP, with myself volunteering to take on the RM and other cat herding responsibilities. There are already quite a few changes slated for 3.0 besides the above (for instance the shell script rewrite), so there's already value in a 3.0 alpha, and the more time we give downstreams to integrate, the better.

This opens up discussion about inclusion of other changes, but I'm hoping to freeze incompatible changes after maybe two alphas, do a beta (with no further incompat changes allowed), and then finally a 3.x GA. For those keeping track, that means a 3.x GA in about four months.

I would also like to stress though that this is not intended to be a big bang release.
For instance, it would be great if we could maintain wire compatibility between 2.x and 3.x, so rolling upgrades work. Keeping branch-2 and branch-3 similar also makes backports easier, since we're likely maintaining 2.x for a while yet.

Please let me know any comments / concerns related to the above. If people are friendly to the idea, I'd like to cut a branch-3 and start working on the first alpha.

Best,
Andrew