Re: HashingTFModel/IDFModel in Structured Streaming

2017-11-01 Thread Davis Varghese
Sure. I will get one over the weekend.

Re: HashingTFModel/IDFModel in Structured Streaming

2017-11-01 Thread Bago Amirbekian
Davis, I'm looking into this. If you could include some code that I can use to reproduce the error, along with the stack trace, it would be really helpful. On Fri, Oct 20, 2017 at 11:01 AM Joseph Bradley wrote: > Hi Davis, > We've started tracking these issues under this umbrella: > https://issues.apache.or

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Noman Khan
+1 for ultra-low latency. Thanks & Regards, Noman From: Reynold Xin Sent: Wednesday, 1 November, 21:07 Subject: [Vote] SPIP: Continuous Processing Mode for Structured Streaming To: dev@spark.apache.org Earlier I sent out a discussion thread for CP

SPARK-22211: Removing an incorrect FOJ optimization

2017-11-01 Thread Henry Robinson
Hi - I'm digging into some Spark SQL tickets, and wanted to ask a procedural question about SPARK-22211 and optimizer changes in general. To summarise the JIRA, Catalyst appears to be incorrectly pushing a limit down below a FULL OUTER JOIN, which risks incorrect results. I don't believe the
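To illustrate why a limit below a full outer join is unsafe, here is a minimal sketch in plain Python. It is not taken from the thread or from Catalyst: the helper `full_outer_join` and the sample keys are hypothetical, and the join is modelled on lists of integer keys only. It assumes the limit is pushed to the left input, which is one way such a rewrite could go wrong.

```python
from typing import List, Optional, Tuple

Row = Tuple[Optional[int], Optional[int]]


def full_outer_join(left: List[int], right: List[int]) -> List[Row]:
    """Equi-join two key lists, keeping unmatched keys from both sides."""
    rows: List[Row] = []
    for l in left:
        matches = [r for r in right if r == l]
        if matches:
            rows.extend((l, r) for r in matches)
        else:
            rows.append((l, None))  # left key with no partner
    # Right keys with no partner on the left are padded with None.
    rows.extend((None, r) for r in right if r not in left)
    return rows


left = [1, 2, 3]
right = [1, 2, 3]

# Correct plan: join first, apply any LIMIT afterwards.
correct = full_outer_join(left, right)

# Hypothetical rewritten plan: LIMIT 1 pushed below the join on the left.
pushed = full_outer_join(left[:1], right)

# Rows that can only appear because the left input was cut short.
spurious = [row for row in pushed if row not in correct]

print("correct:", correct)
print("pushed:", pushed)
print("spurious:", spurious)
```

With the left side truncated, right-side keys 2 and 3 lose their partners and surface as null-padded rows. A limit applied on top of the join may then return one of those rows, which the original query could never have produced.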

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Reynold Xin
I just replied. On Wed, Nov 1, 2017 at 5:50 PM, Cody Koeninger wrote: > Was there any answer to my question around the effect of changes to > the sink api regarding access to underlying offsets? > > On Wed, Nov 1, 2017 at 11:32 AM, Reynold Xin wrote: > > Most of those should be answered by the

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Cody Koeninger
Was there any answer to my question around the effect of changes to the sink api regarding access to underlying offsets? On Wed, Nov 1, 2017 at 11:32 AM, Reynold Xin wrote: > Most of those should be answered by the attached design sketch in the JIRA > ticket. > > On Wed, Nov 1, 2017 at 5:29 PM De

Announcing Spark on Kubernetes release 0.5.0

2017-11-01 Thread Yinan Li
The Spark on Kubernetes development community is pleased to announce release 0.5.0 of Apache Spark with Kubernetes as a native scheduler back-end! This release includes a few bug fixes and the following features: - Spark R support - Kubernetes 1.8 support - Mounts emptyDir volumes for te

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Reynold Xin
Most of those should be answered by the attached design sketch in the JIRA ticket. On Wed, Nov 1, 2017 at 5:29 PM Debasish Das wrote: > +1 > > Is there any design doc related to API/internal changes? Will CP be the > default in Structured Streaming or is it a mode in conjunction with > existing

Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Debasish Das
+1 Is there any design doc related to API/internal changes? Will CP be the default in Structured Streaming or is it a mode in conjunction with existing behavior? Thanks. Deb On Nov 1, 2017 8:37 AM, "Reynold Xin" wrote: Earlier I sent out a discussion thread for CP in Structured Streaming: ht

[Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-01 Thread Reynold Xin
Earlier I sent out a discussion thread for CP in Structured Streaming: https://issues.apache.org/jira/browse/SPARK-20928 It is meant to be a very small, surgical change to Structured Streaming to enable ultra-low latency. This is great timing because we are also designing and implementing data so

Re: [SS] Custom Sinks

2017-11-01 Thread Reynold Xin
They will probably both change, but I wouldn't block on the change if you have an immediate need. On Wed, Nov 1, 2017 at 10:41 AM, Anton Okolnychyi < anton.okolnyc...@gmail.com> wrote: > Hi all, > > I have a question about the future of custom data sinks in Structured > Streaming. In particular,

[SS] Custom Sinks

2017-11-01 Thread Anton Okolnychyi
Hi all, I have a question about the future of custom data sinks in Structured Streaming. In particular, I want to know how continuous processing and the Datasource API V2 will impact them. Right now, it is possible to have custom data sinks via the current Datasource API (V1) by implementing Stre