My 2 cents: "micro-batch" is how Spark happens to implement streaming, not a semantic we are committing to. Semantically, and ideally, the same SQL query should produce the same result in batch and in streaming (modulo late events), once the operations in the query are supported.
On Fri, Nov 2, 2018 at 3:54 PM, kant kodali <kanth...@gmail.com> wrote:

> If I can add one thing to this list, I would say stateless aggregations
> using raw SQL.
>
> For example: as I read micro-batches from Kafka, I want to do, say, a count
> of that micro-batch and emit it using raw SQL (no count aggregation
> across batches).
>
> On Tue, Oct 30, 2018 at 4:55 PM Jungtaek Lim <kabh...@gmail.com> wrote:
>
>> OK, thanks for clarifying. I guess it is one of the major features in the
>> streaming area and would be nice to add, but I also agree it would require
>> huge investigation.
>>
>> On Wed, Oct 31, 2018 at 8:06 AM, Michael Armbrust <mich...@databricks.com>
>> wrote:
>>
>>>> Agreed. Just curious, could you explain what you mean by "negation"?
>>>> Does it mean applying retraction on aggregates?
>>>>
>>>
>>> Yeah, exactly. Our current streaming aggregation assumes that the input
>>> is in append mode, and multiple aggregations break this.
>>>
>>
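To make the distinction in this thread concrete, here is a minimal plain-Python sketch (no Spark; the batch contents are made-up illustration data) contrasting the stateless per-micro-batch count kant asked for with the stateful running count that Spark's streaming aggregation maintains across batches:

```python
import itertools

# Hypothetical micro-batches, e.g. records read from a Kafka topic.
micro_batches = [["a", "b", "c"], ["d", "e"], ["f"]]

# Stateless per-batch count: each micro-batch is aggregated in
# isolation, with no state carried across batches.
per_batch = [len(b) for b in micro_batches]

# Stateful running count: the usual streaming aggregation, which
# accumulates state across batches (assuming append-mode input).
running = list(itertools.accumulate(per_batch))

print(per_batch)  # [3, 2, 1]
print(running)    # [3, 5, 6]
```

The stateless variant is just a per-batch query with no aggregation state to manage; the stateful variant is where the append-mode assumption (and hence retraction, once upstream operators can emit updates) comes into play.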