Re: PairRDD serialization exception

2015-03-12 Thread dylanhockey
I have the exact same error. I am running a PySpark job in yarn-client mode. It works well in standalone mode, but I need to run it in yarn-client mode. Other people have reported the same problem when bundling jars and extra dependencies. I'm pointing PySpark to a specific Python executable bundled

Re: PairRDD serialization exception

2015-03-11 Thread Manas Kar
Hi Sean, Below are the sbt dependencies that I am using. I gave it another try after removing the "provided" keyword, which failed with the same error. What confuses me is that the stack trace appears after a few of the stages have already run to completion. object V { val spark = "1.2.0-cdh5.3.0" v

Re: PairRDD serialization exception

2015-03-11 Thread Sean Owen
This usually means you are mixing different versions of code. Here it is complaining about a Spark class. Are you sure you built against the exact same Spark binaries, and are not including them in your app? On Wed, Mar 11, 2015 at 1:40 PM, manasdebashiskar wrote: > (This is a repost. May be a simpler
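
The advice above (build against the same Spark version the cluster runs, and keep Spark out of the application jar) could be sketched in sbt roughly as follows. This is a hypothetical fragment, not the build file from this thread: only the version string "1.2.0-cdh5.3.0" comes from the messages, while the choice of the spark-core artifact, the Scala version, and the Cloudera resolver are assumptions.

```scala
// build.sbt: hypothetical sketch, not the build definition used in this thread
scalaVersion := "2.10.4" // assumption: a Scala 2.10 build of Spark 1.2.x

// Assumption: CDH artifacts are resolved from Cloudera's public repository
resolvers += "cloudera-repos" at "https://repository.cloudera.com/artifactory/cloudera-repos/"

// "provided" means: compile against this Spark version, but expect the cluster
// to supply the Spark classes at runtime instead of packaging them with the app
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0-cdh5.3.0" % "provided"
```

The scope only helps if the declared version matches what the cluster actually runs, and if no other dependency pulls a second copy of Spark into the application jar.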