Both are true, but not because Spark is written in Scala; it's
because Spark executes in the JVM. Scala/Java-based apps have an
advantage in that they don't have to serialize data back and forth to
a Python process, a step that also introduces a new set of things that
can go wrong. Python is also inherently slower to execute. There is a
real runtime performance hit.
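
To make the serialization point concrete, here is a rough pure-Python
sketch of the extra work involved. It is not Spark code: the function
names are made up for illustration, and pickle simply stands in for the
serializer that moves rows between the JVM and a Python worker. It only
shows that the same function costs more when every batch has to be
serialized, shipped, deserialized, and serialized back.

```python
import pickle
import timeit


def jvm_to_python(batch):
    """Hypothetical stand-in for the JVM side: serialize a batch of rows to bytes."""
    return pickle.dumps(batch, protocol=2)


def python_worker(payload, func):
    """Hypothetical Python worker: deserialize, apply func, serialize the results."""
    rows = pickle.loads(payload)
    return pickle.dumps([func(row) for row in rows], protocol=2)


def python_to_jvm(payload):
    """Hypothetical stand-in for the JVM reading the results back."""
    return pickle.loads(payload)


def run_with_round_trip(batch, func):
    """Apply func to every row, paying for serialization in both directions."""
    return python_to_jvm(python_worker(jvm_to_python(batch), func))


def run_in_process(batch, func):
    """Apply func to every row with no serialization at all."""
    return [func(row) for row in batch]


def double(x):
    return x * 2


if __name__ == "__main__":
    rows = list(range(10000))
    direct = timeit.timeit(lambda: run_in_process(rows, double), number=20)
    shipped = timeit.timeit(lambda: run_with_round_trip(rows, double), number=20)
    print("in-process: %.4fs, with round trip: %.4fs" % (direct, shipped))
```

Both paths return the same result; the round-trip version just does
more work per batch. Actual numbers depend on the data and the machine,
so treat the timing as indicative only.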

The Python APIs lag a bit, especially in areas where you need to
integrate with third-party (JVM-based) components such as Kafka.

I would certainly choose Scala, all else being equal. But then again,
I don't like Python, so I would say that. PySpark is certainly usable
but does have its cost.

On Tue, Oct 6, 2015 at 11:15 PM, dant <dan.tr...@gmail.com> wrote:
> Hi
>
> I'm hearing a common theme running that I should only do serious programming
> in Scala on Spark (1.5.1). Real power users use Scala. It is said that
> Python is great for analytics but in the end the code should be written to
> Scala to finalise. There are a number of reasons I'm hearing:
>
> 1. Spark is written in Scala so will always be faster than any other
> language implementation on top of it.
> 2. Spark releases always favour more features being visible and enabled for
> Scala API than Python API.
>
> Is there any truth to the above? I'm a little sceptical.
>
> Apologies for the duplication, my previous message was held up due to
> subscription issue. Reposting now.
>
> Thanks
> Dan
>
>
>
>
> --
> View this message in context: 
> http://apache-spark-user-list.1001560.n3.nabble.com/Does-feature-parity-exist-between-Spark-and-PySpark-tp24963.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
