Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Liang-Chi Hsieh
+1 Xiao Li wrote > +1 > > Xiao > On Mon, 11 Sep 2017 at 6:44 PM Matei Zaharia wrote: > >> +1 (binding) >> >> > On Sep 11, 2017, at 5:54 PM, Hyukjin Kwon wrote: >> > >> > +1 (non-binding) >> > >> > >> > 2017-09-12 9:52 GMT+09:00 Yin Huai

Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Xiao Li
+1 Xiao On Mon, 11 Sep 2017 at 6:44 PM Matei Zaharia wrote: > +1 (binding) > > > On Sep 11, 2017, at 5:54 PM, Hyukjin Kwon wrote: > > > > +1 (non-binding) > > > > > > 2017-09-12 9:52 GMT+09:00 Yin Huai : > > +1 > > > > On Mon, Sep 11, 2017 at 5:47 PM, Sameer Agarwal > wrote: > > +1 (non-bindin

Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Matei Zaharia
+1 (binding) > On Sep 11, 2017, at 5:54 PM, Hyukjin Kwon wrote: > > +1 (non-binding) > > > 2017-09-12 9:52 GMT+09:00 Yin Huai : > +1 > > On Mon, Sep 11, 2017 at 5:47 PM, Sameer Agarwal wrote: > +1 (non-binding) > > On Thu, Sep 7, 2017 at 9:10 PM, Bryan Cutler wrote: > +1 (non-binding) for

Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Hyukjin Kwon
+1 (non-binding) 2017-09-12 9:52 GMT+09:00 Yin Huai : > +1 > > On Mon, Sep 11, 2017 at 5:47 PM, Sameer Agarwal > wrote: > >> +1 (non-binding) >> >> On Thu, Sep 7, 2017 at 9:10 PM, Bryan Cutler wrote: >> >>> +1 (non-binding) for the goals and non-goals of this SPIP. I think it's >>> fine to wo

Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Yin Huai
+1 On Mon, Sep 11, 2017 at 5:47 PM, Sameer Agarwal wrote: > +1 (non-binding) > > On Thu, Sep 7, 2017 at 9:10 PM, Bryan Cutler wrote: > >> +1 (non-binding) for the goals and non-goals of this SPIP. I think it's >> fine to work out the minor details of the API during review. >> >> Bryan >> >> On

Re: [VOTE][SPIP] SPARK-21190: Vectorized UDFs in Python

2017-09-11 Thread Sameer Agarwal
+1 (non-binding) On Thu, Sep 7, 2017 at 9:10 PM, Bryan Cutler wrote: > +1 (non-binding) for the goals and non-goals of this SPIP. I think it's > fine to work out the minor details of the API during review. > > Bryan > > On Wed, Sep 6, 2017 at 5:17 AM, Takuya UESHIN > wrote: > >> Hi all, >> >>

Re: Easy way to get offset metadata with Spark Streaming API

2017-09-11 Thread Cody Koeninger
https://issues-test.apache.org/jira/browse/SPARK-18258 On Mon, Sep 11, 2017 at 7:15 AM, Dmitry Naumenko wrote: > Hi all, > > It started as a discussion in > https://stackoverflow.com/questions/46153105/how-to-get-kafka-offsets-with-spark-structured-streaming-api. > > So the problem is that there is

Re: Timestamp interoperability design doc available for review

2017-09-11 Thread Imran Rashid
I've posted a design doc on SPARK-12297, which builds on what Zoltan posted here earlier. It addresses the Parquet issues and also considers current inconsistencies in timestamp behavior for Spark across data formats and versions. I believe this incorporates all of the prior concerns and feedback

Easy way to get offset metadata with Spark Streaming API

2017-09-11 Thread Dmitry Naumenko
Hi all, It started as a discussion in https://stackoverflow.com/questions/46153105/how-to-get-kafka-offsets-with-spark-structured-streaming-api . So the problem is that there is no support in the public API to obtain the Kafka (or Kinesis) offsets. For example, if you want to save offsets in external st

[SPARK-20199][ML] : Provided featureSubsetStrategy to GBTClassifier and GBTRegressor

2017-09-11 Thread Pralabh Kumar
Hi Developers, can somebody look into this pull request? It's being reviewed by MLnick, sethah, mpjlu

Re: Putting Kafka 0.8 behind an (opt-in) profile

2017-09-11 Thread Sean Owen
Pull request is ready to go: https://github.com/apache/spark/pull/19134 I flag it one more time because it means Kafka 0.8 is deprecated in 2.3.0 and because it will require -Pkafka-0-8 to build in the support now. Pardon, I want to be sure: does this mean PySpark Kafka support effectively has no

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2 read path

2017-09-11 Thread Wenchen Fan
This vote passes with 4 binding +1 votes, 10 non-binding votes, one +0 vote, and no -1 votes. Thanks all! +1 votes (binding): Wenchen Fan, Herman van Hövell tot Westerflier, Michael Armbrust, Reynold Xin. +1 votes (non-binding): Xiao Li, Sameer Agarwal, Suresh Thalamati, Ryan Blue, Xingbo Jiang, Dongjoo

Re: [VOTE] [SPIP] SPARK-15689: Data Source API V2 read path

2017-09-11 Thread Wenchen Fan
yea, join push down (providing the other reader and join conditions) and aggregate push down (providing grouping keys and aggregate functions) can be added via the current framework in the future. On Mon, Sep 11, 2017 at 1:54 PM, Hemant Bhanawat wrote: > +1 (non-binding) > > I have found the sug