RE: PySpark 1.2 Hadoop version mismatch

2015-02-12 Thread Michael Nazario
[...@cloudera.com]
Sent: Thursday, February 12, 2015 12:13 AM
To: Akhil Das
Cc: Michael Nazario; user@spark.apache.org
Subject: Re: PySpark 1.2 Hadoop version mismatch

> No, "mr1" should not be the issue here, and I think that would break other things. The OP is not using mr1. client 4 / server 7 means roughly "client is Hadoop 1.x, server is Hadoop 2.0.x". …

Re: PySpark 1.2 Hadoop version mismatch

2015-02-12 Thread Sean Owen
No, "mr1" should not be the issue here, and I think that would break other things. The OP is not using mr1. "client 4 / server 7" means roughly "the client is Hadoop 1.x and the server is Hadoop 2.0.x". Normally, I'd say I think you are packaging Hadoop code in your app by bringing in Spark and its deps. …
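Sean's shorthand can be unpacked with a small helper. This is only an illustrative sketch: the version-to-release mapping below comes directly from his explanation in this thread, and anything outside versions 4 and 7 is reported as unknown rather than guessed.

```python
# Mapping grounded in Sean's note: Hadoop IPC v4 ~ Hadoop 1.x client,
# IPC v7 ~ Hadoop 2.0.x (CDH4-era) server. Other versions are not covered
# by this thread, so we decline to label them.
IPC_VERSIONS = {4: "Hadoop 1.x", 7: "Hadoop 2.0.x"}

def explain_mismatch(client_version, server_version):
    """Turn a 'Server IPC version X cannot communicate with client version Y'
    style error into a human-readable hint."""
    client = IPC_VERSIONS.get(client_version, "unknown Hadoop release")
    server = IPC_VERSIONS.get(server_version, "unknown Hadoop release")
    return (f"client speaks IPC v{client_version} ({client}); "
            f"server speaks IPC v{server_version} ({server}) -- "
            "rebuild Spark against the cluster's Hadoop version")

print(explain_mismatch(4, 7))
```

In the OP's case this prints that the driver-side client is Hadoop 1.x while the CDH4 server is Hadoop 2.0.x, which matches Sean's diagnosis that the app is pulling in the wrong Hadoop client via Spark's dependencies.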

Re: PySpark 1.2 Hadoop version mismatch

2015-02-11 Thread Akhil Das
Did you have a look at http://spark.apache.org/docs/1.2.0/building-spark.html? I think you can simply download the source and build it for your Hadoop version:

mvn -Dhadoop.version=2.0.0-mr1-cdh4.7.0 -DskipTests clean package

Thanks
Best Regards

On Thu, Feb 12, 2015 at 11:45 AM, Michael Nazario …

RE: PySpark 1.2 Hadoop version mismatch

2015-02-11 Thread Michael Nazario
I also forgot some other information. I have made this error go away by having my PySpark application use spark-1.1.1-bin-cdh4 for the driver while communicating with a Spark 1.2 master and worker. It's not a good workaround, so I would like the driver to also be on Spark 1.2.

Michael