Re: Usage of Hadoop 2.2.0

Chiwan Park Thu, 03 Sep 2015 21:20:23 -0700

+1 for dropping Hadoop 2.2.0

Regards,
Chiwan Park


> On Sep 4, 2015, at 5:58 AM, Ufuk Celebi <u...@apache.org> wrote:
> 
> +1 to what Robert said.
> 
> On Thursday, September 3, 2015, Robert Metzger <rmetz...@apache.org> wrote:
> I think most cloud providers have moved beyond Hadoop 2.2.0.
> Google's Click-To-Deploy is on 2.4.1
> AWS EMR is on 2.6.0
> 
> The situation for the distributions seems to be the following:
> MapR 4 uses Hadoop 2.4.0 (current is MapR 5)
> CDH 5.0 uses 2.3.0 (the current CDH release is 5.4)
> 
> HDP 2.0 (October 2013) uses 2.2.0
> HDP 2.1 (April 2014) already uses 2.4.0
> 
> So both vendors and cloud providers are multiple releases away from Hadoop 
> 2.2.0.
> 
> Spark does not offer a binary distribution lower than 2.3.0.
> 
> In addition to that, I don't think that the HDFS client in 2.2.0 is really 
> usable in production environments. Users were reporting ArrayIndexOutOfBounds 
> exceptions for some jobs, and I also ran into these exceptions occasionally.
> 
> The easiest approach to resolve this issue would be (a) dropping 
> support for Hadoop 2.2.0.
> An alternative approach (b) would be:
>  - ship a binary version for Hadoop 2.3.0
>  - keep the Flink source compatible with 2.2.0, so that users can 
>    compile a Hadoop 2.2.0 version if needed.
> 
> I would vote for approach (a).
> 
> 
> On Tue, Sep 1, 2015 at 5:01 PM, Till Rohrmann <trohrm...@apache.org> wrote:
> While working on high availability (HA) for Flink's YARN execution I stumbled 
> across some limitations with Hadoop 2.2.0. From version 2.2.0 to 2.3.0, 
> Hadoop introduced new functionality which is required for an efficient HA 
> implementation. Therefore, I was wondering whether there is actually a need 
> to support Hadoop 2.2.0. Is Hadoop 2.2.0 still actively used by someone?
> 
> Cheers,
> Till
>
