I have a project which runs fine on both Spark 1.6.2 and 2.1.0. It fits a logistic regression model using MLlib. I compiled 2.1.0 from source today, while 1.6.2 is the prebuilt-with-Hadoop binary. The odd thing is that 1.6.2 produces an answer in 350 sec while 2.1.0 takes 990 sec, with identical PySpark code. I'm wondering whether something in the default setup parameters between 1.6 and 2.1, say the number of executors or the memory allocation, might account for this? I'm using just the 4 cores of my machine, acting as both master and executors.
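One way to rule the configuration in or out is to dump the resolved settings on each installation and diff the two outputs. Below is a minimal sketch of that (not from the original post); the app name is illustrative, and it assumes local mode with 4 cores as described. SparkContext.getConf().getAll() should return the effective key/value pairs on both 1.6 and 2.1, though I haven't verified every key appears identically on both versions.

    # Sketch: print the effective Spark configuration so the 1.6.2 and
    # 2.1.0 runs can be compared line by line. Run once per installation
    # and diff the sorted output.
    from pyspark import SparkConf, SparkContext

    # Pin the master explicitly so both versions use the same 4 local cores.
    conf = SparkConf().setMaster("local[4]").setAppName("conf-dump")
    sc = SparkContext(conf=conf)

    # getAll() returns the resolved (key, value) pairs, including defaults
    # that were set explicitly; keys left at built-in defaults may not appear.
    for key, value in sorted(sc.getConf().getAll()):
        print(key, "=", value)

    sc.stop()

Differences in entries like spark.driver.memory, spark.executor.memory, or the shuffle/serializer settings between the two runs would be the first place to look.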
