On 05/03/2015 13:05, "Alejandro Abdelnur" <tuc...@gmail.com> wrote:

> IMO, if part of the community wants to take on the responsibility and
> work that it takes to do a new major release, we should not discourage
> them from doing that. Having multiple major branches active is a
> standard practice.

Looking @ 2.x, the major work (HDFS HA, YARN) meant that it did take a long time to get out, and during that time 0.21 and 0.22 got released and ignored; 0.23 got picked up and used in production.

The 2.0.4-alpha release was more of a trouble spot, as it got picked up widely enough to be used in products, and changes were made between that alpha & 2.2 itself which raised compatibility issues.

For 3.x I'd propose:

1. Have less longevity of 3.x alpha/beta artifacts.
2. Make clear there are no guarantees of compatibility from alpha/beta releases to shipping. Best effort, but not to the extent that it gets in the way. More succinctly: we will care more about seamless migration from 2.2+ to 3.x than from a 3.0-alpha to a 3.3 production release.
3. Anybody who ships code based on 3.x alpha/beta has to recognise and accept policy (2).

Call it Hadoop's "instability guarantee" for the 3.x alpha/beta phase.

As well as backwards compatibility, we need to think about forwards compatibility, with the goal being:

    Any app written/shipped with the 3.x release binaries (JAR and
    native) will work against a 3.y Hadoop release, for all x, y in
    Natural where y >= x and is-release(x) and is-release(y).

That's important, as it means all server-side changes in 3.x which are expected to mandate client-side updates (protocols, HDFS erasure decoding, security features) must be considered complete and stable before we can say is-release(x). In an ideal world, we'll even get the semantics right, with tests to show this.

Fixing classpath hell downstream is certainly one feature I am +1 on in this roadmap: classpath isolation. But: it's only one of the features, and given there's not any design doc on that JIRA, it's way too immature to set a release schedule on. An alpha schedule with no guarantees and a regular alpha roll could be viable, as new features go in and can then be used to experimentally try this stuff in branches of HBase (well volunteered, Stack!), etc. Of course, instability guarantees will be transitive.

> This time around we are not replacing the guts as we did from Hadoop 1
> to Hadoop 2, but doing superficial surgery to address issues that were
> not considered (or were too much to take on top of the guts
> transplant).
>
> For the split-brain concern, we did a great job of maintaining Hadoop 1
> and Hadoop 2 until Hadoop 1 faded away.

And had a significant argument about 2.0.4-alpha to 2.2 protobuf/HDFS compatibility.

> Based on that experience I would say that the coexistence of Hadoop 2
> and Hadoop 3 will be much less demanding/traumatic.

The re-layout of all the source trees was a major change there; assuming there's no refactoring or switch of build tools, cherry-picking things back will be tractable.

> Also, to facilitate the coexistence we should limit Java language
> features to Java 7 (even if the runtime is Java 8); once Java 7 is not
> used anymore we can remove this limitation.

+1; setting javac.version will fix this.
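For anyone who hasn't been in that bit of the build: javac.version is the property in hadoop-project/pom.xml which feeds the maven-compiler-plugin's -source/-target settings, so pinning it holds the language level at 7 even when everyone is building on a JDK8. Roughly this (a sketch from memory; treat the exact wiring as an assumption and check the real POM):

    <!-- hadoop-project/pom.xml: sketch only, wiring quoted from memory -->
    <properties>
      <!-- keep the language level at Java 7 even on a JDK8 build -->
      <javac.version>1.7</javac.version>
    </properties>

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <configuration>
        <source>${javac.version}</source>
        <target>${javac.version}</target>
      </configuration>
    </plugin>

One property to flip when the Java 7 constraint is finally dropped.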
What is nice about having Java 8 as the base JVM is that it means you can be confident that all Hadoop 3 servers will be JDK8+, so downstream apps and libs can use all the Java 8 features they want to.

There's one policy change to consider there, which is possibly, just possibly, we could allow new modules in hadoop-tools to adopt the Java 8 language early, provided everyone recognised that "backport to branch-2" isn't going to happen.

-Steve
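PS: to make the downstream argument concrete, here's the sort of thing a client could write once JDK8 is the floor. The class and its logic are invented for illustration; only the FileSystem calls are real Hadoop API:

    import java.io.IOException;
    import java.util.Arrays;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Invented example: sum the size of the plain files directly under
    // a directory, using Java 8 streams and method references, exactly
    // the kind of code downstream projects can't ship while they still
    // have to compile for Java 6/7 clusters.
    public class DirUsage {
      public static void main(String[] args) throws IOException {
        FileSystem fs = FileSystem.get(new Configuration());
        long bytes = Arrays.stream(fs.listStatus(new Path(args[0])))
            .filter(FileStatus::isFile)
            .mapToLong(FileStatus::getLen)
            .sum();
        System.out.println(args[0] + ": " + bytes + " bytes");
      }
    }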