+1 for dropping

On 09/04/2015 11:04 AM, Maximilian Michels wrote:
> +1 for dropping Hadoop 2.2.0 binary and source-compatibility. The
> release is hardly used and complicates the important high-availability
> changes in Flink.
>
> On Fri, Sep 4, 2015 at 9:33 AM, Stephan Ewen <se...@apache.org> wrote:
>> I am good with that as well. Mind that we are not only dropping a binary
>> distribution for Hadoop 2.2.0, but also the source compatibility with 2.2.0.
>>
>> Lets also reconfigure Travis to test
>>
>> - Hadoop1
>> - Hadoop 2.3
>> - Hadoop 2.4
>> - Hadoop 2.6
>> - Hadoop 2.7
>>
>> On Fri, Sep 4, 2015 at 6:19 AM, Chiwan Park <chiwanp...@apache.org> wrote:
>>>
>>> +1 for dropping Hadoop 2.2.0
>>>
>>> Regards,
>>> Chiwan Park
>>>
>>>> On Sep 4, 2015, at 5:58 AM, Ufuk Celebi <u...@apache.org> wrote:
>>>>
>>>> +1 to what Robert said.
>>>>
>>>> On Thursday, September 3, 2015, Robert Metzger <rmetz...@apache.org>
>>>> wrote:
>>>> I think most cloud providers moved beyond Hadoop 2.2.0.
>>>> Google's Click-To-Deploy is on 2.4.1
>>>> AWS EMR is on 2.6.0
>>>>
>>>> The situation for the distributions seems to be the following:
>>>> MapR 4 uses Hadoop 2.4.0 (current is MapR 5)
>>>> CDH 5.0 uses 2.3.0 (the current CDH release is 5.4)
>>>>
>>>> HDP 2.0 (October 2013) is using 2.2.0
>>>> HDP 2.1 (April 2014) uses 2.4.0 already
>>>>
>>>> So both vendors and cloud providers are multiple releases away from
>>>> Hadoop 2.2.0.
>>>>
>>>> Spark does not offer a binary distribution lower than 2.3.0.
>>>>
>>>> In addition to that, I don't think that the HDFS client in 2.2.0 is
>>>> really usable in production environments. Users were reporting
>>>> ArrayIndexOutOfBounds exceptions for some jobs, I also had these exceptions
>>>> sometimes.
>>>>
>>>> The easiest approach to resolve this issue would be (a) dropping the
>>>> support for Hadoop 2.2.0
>>>> An alternative approach (b) would be:
>>>> - ship a binary version for Hadoop 2.3.0
>>>> - make the source of Flink still compatible with 2.2.0, so that users
>>>> can compile a Hadoop 2.2.0 version if needed.
>>>>
>>>> I would vote for approach (a).
>>>>
>>>> On Tue, Sep 1, 2015 at 5:01 PM, Till Rohrmann <trohrm...@apache.org>
>>>> wrote:
>>>> While working on high availability (HA) for Flink's YARN execution I
>>>> stumbled across some limitations with Hadoop 2.2.0. From version 2.2.0 to
>>>> 2.3.0, Hadoop introduced new functionality which is required for an
>>>> efficient HA implementation. Therefore, I was wondering whether there is
>>>> actually a need to support Hadoop 2.2.0. Is Hadoop 2.2.0 still actively
>>>> used by someone?
>>>>
>>>> Cheers,
>>>> Till