Re: [Discussion] Clarification regarding Stateful Aggregations over Structured Streaming

2018-12-16 Thread Chitral Verma
Thanks, Stavros, for the clarification. I'll write some documentation for this and raise it as an enhancement issue with a pull request. Meanwhile, users who want this functionality can add spark-states as a dependency and use it. O

Re: [Discussion] Clarification regarding Stateful Aggregations over Structured Streaming

2018-12-16 Thread Stavros Kontopoulos
Hi, as you already know, the Databricks runtime has this enhancement, so it is considered a good option if you want to decouple state from the JVM. The Flink paper gives some arguments for doing so, along with incremental snapshotting: http://www.vldb.org/pvldb/vol10/p1718-carbone.pdf. Also ti

[Discussion] Clarification regarding Stateful Aggregations over Structured Streaming

2018-12-16 Thread Chitral Verma
Hi Devs, for quite some time I've been looking at the Structured Streaming API to solve many use cases at my workplace, and I have some doubts I'd like to clarify regarding stateful aggregations over Structured Streaming. Currently, Spark provides flatMapGroupsWithState (FMGWS) / mapGroupsWithSta