Thank you, Amit! I was looking for this kind of information.

I have not read your paper in full yet, but I see it contains a TODO with 
basically the same question(s) [1]. Maybe someone from the Spark team 
(including Databricks) will be so kind as to send some feedback.

Best,
Ovidiu

[1] Integrate “Structured Streaming”: //TODO - What (and how) will Spark 2.0 
support (out-of-order, event-time windows, watermarks, triggers, accumulation 
modes) - how straightforward will it be to integrate with the Beam Model?


> On 21 May 2016, at 23:00, Sela, Amit <ans...@paypal.com> wrote:
> 
> It seems I forgot to add the link to the “Technical Vision” paper, so here it 
> is:
> https://docs.google.com/document/d/1y4qlQinjjrusGWlgq-mYmbxRW2z7-_X5Xax-GG0YsC0/edit?usp=sharing
> 
> From: "Sela, Amit" <ans...@paypal.com>
> Date: Saturday, May 21, 2016 at 11:52 PM
> To: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>, "user @spark" 
> <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
> 
> This is a “Technical Vision” paper for the Spark runner, which provides 
> general guidelines for the future development of Spark’s Beam support as part 
> of the Apache Beam (incubating) project.
> This is our JIRA:
> https://issues.apache.org/jira/browse/BEAM/component/12328915/?selectedTab=com.atlassian.jira.jira-projects-plugin:component-summary-panel
> 
> I’m currently working on Dataset integration for batch (to replace RDDs) 
> against Spark 1.6, and moving towards enhancing stream-processing 
> capabilities with Structured Streaming (2.0).
> 
> You’re also welcome to ask those questions on the Apache Beam (incubating) 
> mailing list ;)
> http://beam.incubator.apache.org/mailing_lists/
> 
> Thanks,
> Amit
> 
> From: Ovidiu-Cristian MARCU <ovidiu-cristian.ma...@inria.fr>
> Date: Tuesday, May 17, 2016 at 12:11 AM
> To: "user @spark" <user@spark.apache.org>
> Cc: Ovidiu Cristian Marcu <ovidiu21ma...@gmail.com>
> Subject: Re: What / Where / When / How questions in Spark 2.0 ?
> 
> Could you please give a short answer regarding the Apache Beam Capability 
> Matrix TODOs for the upcoming Spark 2.0 release [4]? (Some related references 
> are below [5][6].)
> 
> Thanks
> 
> [4] http://beam.incubator.apache.org/capability-matrix/#cap-full-what
> [5] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-101
> [6] https://www.oreilly.com/ideas/the-world-beyond-batch-streaming-102
> 
>> On 16 May 2016, at 14:18, Ovidiu-Cristian MARCU 
>> <ovidiu-cristian.ma...@inria.fr> wrote:
>> 
>> Hi,
>> 
>> We can see in [2] many interesting (and expected!) improvements (promises) 
>> such as extended SQL support, a unified API (DataFrames, Datasets), an 
>> improved engine (Tungsten draws on ideas from modern compilers and MPP 
>> databases, similar to Flink [3]), Structured Streaming, etc. It seems we are 
>> witnessing a smart unification of Big Data analytics (Spark, Flink - the best 
>> of both worlds)!
>> 
>> How does Spark address the missing What/Where/When/How questions 
>> (capabilities) highlighted in the Beam unified model [1]?
>> 
>> Best,
>> Ovidiu
>> 
>> [1] 
>> https://cloud.google.com/blog/big-data/2016/05/why-apache-beam-a-google-perspective
>> [2] 
>> https://databricks.com/blog/2016/05/11/spark-2-0-technical-preview-easier-faster-and-smarter.html
>> [3] http://stratosphere.eu/project/publications/
>> 
>> 
> 
