Thanks for the quick reply, I will check the link.
Hopefully, with a conversion to Python 3 (3.4 or later), we could take
advantage of asyncio and other cool new stuff ...
On Thu, Jan 29, 2015 at 7:41 PM, Reynold Xin wrote:
It is something like this: https://issues.apache.org/jira/browse/SPARK-5097
On the master branch, we have a Pandas-like API already.
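
For context, a minimal sketch of what that Pandas-like API looks like on
the 1.3-era master branch tracked by SPARK-5097. The sample data and
column names below are made up for illustration, not taken from the JIRA:

    from pyspark import SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(appName="df-sketch")
    sqlContext = SQLContext(sc)

    # Build a DataFrame from an in-memory list of tuples.
    df = sqlContext.createDataFrame(
        [(1, "alice", 34), (2, "bob", 25)],
        ["id", "name", "age"])

    # Pandas-flavored column selection and filtering.
    df.select(df.name, df.age).filter(df.age > 30).show()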
On Thu, Jan 29, 2015 at 4:31 PM, Sasha Kacanski wrote:
Hi Reynold,
In my project I want to use the Python API too.
When you mention DFs, are we talking about pandas, or is this something
internal to the Spark Python API?
Could you elaborate a bit on this or point me to alternate documentation?
Thanks much --sasha
On Thu, Jan 29, 2015 at 4:12 PM, Reynold Xin wrote:
Once the data frame API is released for 1.3, you can write your thing in
Python and get the same performance. It can't express everything, but for
basic things like projection, filter, join, aggregate and simple numeric
computation, it should work pretty well.
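
To make that concrete, here is a hedged sketch of those basic operations
with the 1.3 DataFrame API. The "users" and "orders" tables and their
columns are illustrative assumptions, not something from this thread:

    from pyspark.sql import functions as F

    users = sqlContext.table("users")    # assumes a registered table
    orders = sqlContext.table("orders")  # assumes a registered table

    result = (users
              .filter(users.age > 21)                    # filter
              .join(orders, users.id == orders.user_id)  # join
              .select(users.country, orders.amount)      # projection
              .groupBy("country")                        # aggregate
              .agg(F.sum("amount").alias("total")))

    result.show()

Each call only builds up a logical plan that is optimized and executed in
the JVM, so the Python side adds no per-row overhead; that is where the
"same performance" claim comes from.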
On Thu, Jan 29, 2015 at 12:45 PM, rt
Hey,
Without Python itself being as fast as Scala/Java, I think it's impossible
to get similar performance in PySpark as in Scala/Java. Jython is also much
slower than Scala/Java.
With Jython we could avoid the cost of managing multiple processes and RPC,
but we may still need to do the data conversion between Java and Python.
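
A small illustration of the two costs being weighed here (a sketch with
made-up names, not the actual PySpark internals):

    # RDD path: each element is pickled, shipped over a socket to a
    # separate Python worker process, transformed, and shipped back.
    doubled_rdd = sc.parallelize(range(1000)).map(lambda x: x * 2)

    # DataFrame path: only the expression is sent to the JVM; rows never
    # cross the Java <-> Python boundary.
    df = sqlContext.createDataFrame([(i,) for i in range(1000)], ["x"])
    doubled_df = df.select((df.x * 2).alias("doubled"))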