I'd also be interested in seeing such a benchmark.
On Tue, Apr 15, 2014 at 9:25 AM, Ian Ferreira <ianferre...@hotmail.com> wrote:

> This would be super useful. Thanks.
>
> On 4/15/14, 1:30 AM, "Jeremy Freeman" <freeman.jer...@gmail.com> wrote:
>
> >Hi Andrew,
> >
> >I'm putting together some benchmarks for PySpark vs Scala. I'm focusing on
> >ML algorithms, as I'm particularly curious about the relative performance of
> >MLlib in Scala vs the Python MLlib API vs pure Python implementations.
> >
> >Will share real results as soon as I have them, but roughly, in our hands,
> >that 40% number is ballpark correct, at least for some basic operations
> >(e.g. textFile, count, reduce).
> >
> >-- Jeremy
> >
> >---------------------
> >Jeremy Freeman, PhD
> >Neuroscientist
> >@thefreemanlab
> >
> >--
> >View this message in context:
> >http://apache-spark-user-list.1001560.n3.nabble.com/Scala-vs-Python-performance-differences-tp4247p4261.html
> >Sent from the Apache Spark User List mailing list archive at Nabble.com.