One reason I wouldn't change the default is that the Hadoop 2 launched by
spark-ec2 is not a full Hadoop 2 distribution -- it's more of a hybrid
Hadoop version built using CDH4 (it uses HDFS 2, but not YARN, AFAIK).

Also, our default Hadoop version in the Spark build is still 1.0.4 [1], so
doesn't it make sense to stick with that in spark-ec2 as well?

[1] https://github.com/apache/spark/blob/master/pom.xml#L122
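For reference, a minimal sketch of how a default like this is typically
declared in spark_ec2.py's option parsing (the option name matches the flag
under discussion; the exact wording in the script may differ):

    from optparse import OptionParser

    # Sketch of the spark-ec2 option definition; "1" is the default major
    # version being debated here.
    parser = OptionParser(usage="spark-ec2 [options] <action> <cluster_name>")
    parser.add_option(
        "--hadoop-major-version", default="1",
        help="Major version of Hadoop to use (default: %default)")
    (opts, args) = parser.parse_args()

Changing the default would just mean changing that string to "2", but per the
above I'd rather leave it alone for now.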

Thanks
Shivaram

On Sun, Mar 1, 2015 at 2:59 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

>
> https://github.com/apache/spark/blob/fd8d283eeb98e310b1e85ef8c3a8af9e547ab5e0/ec2/spark_ec2.py#L162-L164
>
> Is there any reason we shouldn't update the default Hadoop major version in
> spark-ec2 to 2?
>
> Nick
>
