When will spark 2.0 support dataset python API?

2017-05-31 Thread Cyanny LIANG
Hi, Since the Dataset API has become a common way to process structured data in Spark 2.0, and the Scala and Java APIs support Dataset now, when will the Python Dataset API be released? Or are there any plans? Consider that, in our production environment, many users love to use the Python API, which has many machine

Re: [VOTE] Apache Spark 2.2.0 (RC2)

2017-05-31 Thread Kostas Sakellis
Hey Michael, There is a discussion on TIMESTAMP semantics going on in the thread "SQL TIMESTAMP semantics vs. SPARK-18350" which might impact Spark 2.2. Should we make a decision there before voting on the next RC for Spark 2.2? Thanks, Kostas On Tue, May 30, 2017 at 12:09 PM, Michael Armbrust wro

Re: FYI - Kafka's built-in performance test tool

2017-05-31 Thread 郭健
It seems to be an internal page, so I cannot access it: Your email address doesn't have access to equalum.atlassian.net From: Ofir Manor Date: Friday, May 26, 2017, 01:12 To: dev Subject: FYI - Kafka's built-in performance test tool comes with source code. Some basic results from the VM, * Write every second 50

Re: When will spark 2.0 support dataset python API?

2017-05-31 Thread Wenchen Fan
We tried but didn’t get much benefit from a Python Dataset API, as Python is dynamically typed and there is not much we can do to optimize running Python functions. > On 31 May 2017, at 3:36 AM, Cyanny LIANG wrote: > > Hi, > Since DataSet API has become a common way to process structured data in spark

Re: FYI - Kafka's built-in performance test tool

2017-05-31 Thread Ofir Manor
Hi, sorry for that, I sent my original email to this list by mistake (Gmail autocomplete fooled me); the page I linked isn't open to the public. Anyway, since you are interested, here are the sample commands and output from a VirtualBox image on my laptop. 1. Create a topic kafka-topics.sh --create