Both points are true, but not because Spark is written in Scala; it's because Spark executes in the JVM. Scala- and Java-based apps have an advantage in that they don't have to serialize data back and forth to a Python process, a step which also introduces a new set of things that can go wrong. Python is also inherently slower to execute, so there is a real runtime performance hit.
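To make the serialization point concrete, here is a rough sketch in plain Python. It is not Spark code and does not use any Spark API: the function names (add_one, run_in_process, run_with_round_trip) are made up for illustration, and using pickle as the wire format is an assumption standing in for whatever the JVM and the Python worker actually exchange. It only shows the shape of the extra work: the same map, once applied directly and once with every batch serialized on the way out and again on the way back.

```python
import pickle
import time


def add_one(x):
    # The "user function" applied to every row (hypothetical example).
    return x + 1


def run_in_process(rows):
    # Stand-in for a JVM-side map: the function runs where the data lives.
    return [add_one(r) for r in rows]


def run_with_round_trip(rows, batch_size=1000):
    # Stand-in for a Python worker: each batch is serialized, "shipped",
    # deserialized, processed, and the results make the same trip back.
    out = []
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        received = pickle.loads(pickle.dumps(batch))
        results = [add_one(r) for r in received]
        out.extend(pickle.loads(pickle.dumps(results)))
    return out


def timed(fn, rows):
    # Return the function's result together with its wall-clock time.
    t0 = time.perf_counter()
    result = fn(rows)
    return result, time.perf_counter() - t0


if __name__ == "__main__":
    rows = list(range(200_000))
    direct, t_direct = timed(run_in_process, rows)
    shipped, t_shipped = timed(run_with_round_trip, rows)
    assert direct == shipped  # same answer either way
    print(f"direct:     {t_direct:.4f}s")
    print(f"round trip: {t_shipped:.4f}s")
```

Both paths return the same result; the second simply does more work per batch. The timings depend on the machine and say nothing precise about a real cluster, so treat them as an illustration only.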
Python APIs lag a bit, especially in areas where you need to integrate with third-party (JVM-based) components such as Kafka. I would certainly choose Scala, all else being equal. But then again, I don't like Python, so I would say that. PySpark is certainly usable, but it does have its cost.

On Tue, Oct 6, 2015 at 11:15 PM, dant <dan.tr...@gmail.com> wrote:
> Hi
>
> I'm hearing a common theme running that I should only do serious programming
> in Scala on Spark (1.5.1). Real power users use Scala. It is said that
> Python is great for analytics but in the end the code should be rewritten in
> Scala to finalise. There are a number of reasons I'm hearing:
>
> 1. Spark is written in Scala so will always be faster than any other
> language implementation on top of it.
> 2. Spark releases always favour more features being visible and enabled for
> Scala API than Python API.
>
> Is there any truth to the above? I'm a little sceptical.
>
> Apologies for the duplication, my previous message was held up due to
> subscription issue. Reposting now.
>
> Thanks
> Dan
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Does-feature-parity-exist-between-Spark-and-PySpark-tp24963.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org