Re: question on collect_list or say aggregations in general in structured streaming 2.3.0

2018-05-04 Thread kant kodali
3, 2018 at 1:52 AM. To: "user @spark". Subject: Re: question on collect_list or say aggregations in general in structured streaming 2.3.0. After doing some more research using Google, it's clear that aggregations by default are stateful in Structure

Re: question on collect_list or say aggregations in general in structured streaming 2.3.0

2018-05-03 Thread Arun Mahadevan
: question on collect_list or say aggregations in general in structured streaming 2.3.0. After doing some more research using Google, it's clear that aggregations by default are stateful in Structured Streaming. So the question now is how to do stateless aggregations (not storing the resul

Re: question on collect_list or say aggregations in general in structured streaming 2.3.0

2018-05-03 Thread kant kodali
After doing some more research using Google, it's clear that aggregations by default are stateful in Structured Streaming. So the question now is how to do stateless aggregations (not storing the result from previous batches) using Structured Streaming 2.3.0? I am trying to do it using raw Spark SQL

question on collect_list or say aggregations in general in structured streaming 2.3.0

2018-05-03 Thread kant kodali
Hi All, I was under the assumption that one needs to run groupBy(window(...)) to run any stateful operations, but it looks like that is not the case, since any aggregation, such as the query "select count(*) from some_view", is also stateful: it stores the result of the count from the previous batch. Likew
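
The distinction raised in this thread, between an aggregation that carries its result across batches and one that starts fresh on each batch, can be illustrated with a small sketch. This is plain Python that only simulates micro-batches as lists; it does not use the Spark API, and the function names `stateful_count` and `stateless_count` are hypothetical, chosen for illustration only.

```python
from typing import List, Sequence


def stateful_count(batches: Sequence[Sequence[object]]) -> List[int]:
    """Simulate a count(*) that remembers the result of earlier batches.

    The running total is the "state": each batch's output includes
    every row seen so far, not just the rows in that batch.
    """
    total = 0
    outputs = []
    for batch in batches:
        total += len(batch)      # update the state with the new rows
        outputs.append(total)    # emit the count over all rows so far
    return outputs


def stateless_count(batches: Sequence[Sequence[object]]) -> List[int]:
    """Simulate a count(*) computed independently for each batch.

    Nothing is carried over, so each output reflects one batch only.
    """
    return [len(batch) for batch in batches]


# Two simulated micro-batches: three rows, then two rows.
batches = [[1, 2, 3], [4, 5]]
print(stateful_count(batches))   # [3, 5]
print(stateless_count(batches))  # [3, 2]
```

With the same two batches, the stateful version reports 3 and then 5, while the stateless version reports 3 and then 2. The first behavior is what the original post describes for "select count(*) from some_view"; the second is what the follow-up asks how to obtain.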