That looks roughly right, though you'll want to mark the Spark dependencies as provided, so they aren't bundled into your application jar; the cluster supplies them at runtime. Do you need netlib directly? PySpark won't matter here if you're in Scala; what's installed with pip wouldn't matter in any event.
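As a sketch, your build.sbt could look something like this (based on your dependency list; I've dropped the netlib entry on the assumption you don't need it directly, and this assumes you submit the jar to a cluster rather than run it standalone):

libraryDependencies ++= Seq(
  // Spark itself is on the cluster's classpath at runtime, so mark it provided
  "org.apache.spark" %% "spark-core"  % "2.4.4" % "provided",
  "org.apache.spark" %% "spark-sql"   % "2.4.4" % "provided",
  "org.apache.spark" %% "spark-mllib" % "2.4.4" % "provided",
  // Test-only dependencies
  "org.scalactic" %% "scalactic" % "3.1.2",
  "org.scalatest" %% "scalatest" % "3.1.2" % "test",
  // Application dependencies that do get packaged
  "org.plotly-scala" %% "plotly-render" % "0.7.2"
)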
On Tue, Aug 25, 2020 at 3:30 AM Aviad Klein <aviad.kl...@fundbox.com> wrote:
>
> Hey Chris and Sean, thanks for taking the time to answer.
>
> Perhaps my installation of pyspark is off, although I did use version 2.4.4.
> When developing in Scala and PySpark, how do you set up your environment?
>
> I used sbt for Scala Spark:
>
> libraryDependencies ++= Seq(
>   "org.apache.spark" %% "spark-core" % "2.4.4",
>   "org.apache.spark" %% "spark-sql" % "2.4.4",
>   "org.scalactic" %% "scalactic" % "3.1.2",
>   "org.scalatest" %% "scalatest" % "3.1.2" % "test",
>   "org.apache.spark" %% "spark-mllib" % "2.4.4",
>   "org.plotly-scala" %% "plotly-render" % "0.7.2",
>   "com.github.fommil.netlib" % "all" % "1.1.2" pomOnly()
> )
>
> and pip for PySpark (Python 3.6.5):
>
> pip3 install pyspark==2.4.4