Hi, I'm very interested in using Spark's MLlib in standalone programs. I've never used Hadoop, and I don't intend to deploy on massive clusters. Building Spark has been a real nightmare, and I've been at it on and off for weeks.
The build always runs out of RAM on my laptop (4 GB of RAM, Arch Linux) when I try to build with Scala 2.11 support. No matter how I tweak the JVM flags to reduce maximum memory use, the build crashes. (I've pasted roughly the settings I've been trying in a P.S. below.)

When I tried to build Spark 1.6.0 for Scala 2.10 just now, the build failed with compilation errors. Here is one as a sample; I've saved the rest:

    [error] /home/colin/building/apache-spark/spark-1.6.0/repl/scala-2.10/src/main/scala/org/apache/spark/repl/SparkJLineReader.scala:16: object jline is not a member of package tools
    [error] import scala.tools.jline.console.completer._

Maven then informs me:

    [ERROR] After correcting the problems, you can resume the build with the command
    [ERROR]   mvn <goals> -rf :spark-repl_2.10

I don't feel safe doing that, given that I don't know what my "<goals>" are.

I've also noticed that the build compiles a lot of things I have no interest in. Is it possible to compile just the Spark core, its tools, and MLlib? (See my second P.S. for the kind of command I'm imagining.) I just want to experiment, and this is causing me a lot of stress.

Thank you kindly,
Colin
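P.S. For reference, this is roughly the kind of thing I've been trying for the out-of-memory problem; the exact numbers varied between attempts, and none of these values is a known-good configuration:

    # Cap Maven's heap, permgen, and JIT code cache before building.
    # These particular values are just examples of what I tried.
    export MAVEN_OPTS="-Xmx1500m -XX:MaxPermSize=256m -XX:ReservedCodeCacheSize=128m"
    ./build/mvn -DskipTests clean package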
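P.P.S. On building only some modules: what I'm imagining is something along the lines of Maven's -pl/-am options, if they work with Spark's build (the module name below is a guess on my part, based on the ":spark-repl_2.10" naming in the error output):

    # -pl restricts the build to the listed module(s); -am ("also make")
    # additionally builds whatever they depend on, which I assume would
    # pull in spark-core. The artifact id here is my guess.
    ./build/mvn -pl :spark-mllib_2.10 -am -DskipTests package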