This change will be merged shortly for Spark 1.4, and has a minor implication for those creating their own Spark builds:
https://issues.apache.org/jira/browse/SPARK-7249
https://github.com/apache/spark/pull/5786

The default Hadoop dependency has actually been Hadoop 2.2 for some time, but the build defaults were not fully consistent with a Hadoop 2.2 build. That is what this change resolves. The discussion highlights that relying on the default Hadoop binding is not ideal if you care at all about which Hadoop version you bind to, and that it is good practice to set an explicit -Phadoop-x.y profile in any build.

The net changes are:

- If you don't care about Hadoop at all, you can ignore this; you will now get a consistent Hadoop 2.2 binding by default. Still, you may wish to set a Hadoop profile.
- If you build for Hadoop 1, you now need to set -Phadoop-1.
- If you build for Hadoop 2.2, you should still set -Phadoop-2.2, even though this is the default and is now a no-op profile.
- You can continue to set other Hadoop profiles and override hadoop.version; these are unaffected.

Example invocations are sketched below.
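For concreteness, here is a minimal sketch of what Maven invocations might look like after this change. The hadoop-1 and hadoop-2.2 profile names and the hadoop.version property come from the change described above; the remaining details (-DskipTests, the hadoop-2.4 profile, and the example version number) are assumptions about a typical Spark build rather than part of this change:

  # Hadoop 1.x build: the profile is now required for a Hadoop 1 binding
  mvn -Phadoop-1 -DskipTests clean package

  # Hadoop 2.2 build: now the default, but setting the profile explicitly is still recommended
  mvn -Phadoop-2.2 -DskipTests clean package

  # Other Hadoop profiles and hadoop.version overrides are unaffected, e.g. (assumed example):
  mvn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package

Adjust profiles and versions to match whatever Hadoop binding your deployment actually targets.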