Yes, check out mapWithState: https://databricks.com/blog/2016/02/01/faster-stateful-stream-processing-in-apache-spark-streaming.html
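
A rough, untested sketch of what a timed aggregation with mapWithState looks
like (the socket source, the (key, 1L) pairs, the checkpoint path and the
60-minute timeout below are all placeholders for your own input and window):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, Seconds, State, StateSpec, StreamingContext}

val conf = new SparkConf().setAppName("TimedAggregation").setMaster("local[2]")
val ssc = new StreamingContext(conf, Seconds(10))
ssc.checkpoint("/tmp/checkpoint")          // mapWithState requires a checkpoint directory

// Placeholder input: one key per line from a socket, counted as (key, 1L)
val events = ssc.socketTextStream("localhost", 9999).map(line => (line, 1L))

// Keep a running total per key; keys idle longer than the timeout are dropped
def updateCount(key: String, value: Option[Long], state: State[Long]): Option[(String, Long)] = {
  if (state.isTimingOut()) {
    None                                   // update() is not allowed while a key is timing out
  } else {
    val total = state.getOption.getOrElse(0L) + value.getOrElse(0L)
    state.update(total)
    Some((key, total))
  }
}

val spec = StateSpec.function(updateCount _).timeout(Minutes(60))
val totals = events.mapWithState(spec)
totals.print()

ssc.start()
ssc.awaitTermination()

Keys that time out are passed to the mapping function one last time with
value == None and isTimingOut() == true, which is the place to emit or
persist a final value for that key.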
________________________________
From: Nikhil Goyal
Sent: Monday, May 23, 2016 23:28
Subject: Timed aggregation in Spark
To:
Hi all,
I want to aggregate my data
Hi Iain,
Did you manage to solve this issue?
It looks like we have a similar issue with processing time increasing every
micro-batch but only after 30 batches.
Thanks.
On Thu, Mar 3, 2016 at 4:45 PM Iain Cundy wrote:
> Hi All
>
> I’m aggregating data using mapWithState with a timeout set in
, 2015 at 22:14 Cody Koeninger wrote:
> Solution 2 sounds better to me. You aren't always going to have graceful
> shutdowns.
>
> On Mon, Sep 14, 2015 at 1:49 PM, Ofir Kerker wrote:
>
>> Hi,
>> My Spark Streaming application consumes messages (events) from
Hi,
My Spark Streaming application consumes messages (events) from Kafka every
10 seconds using the direct stream approach and aggregates these messages
into hourly aggregations (to answer analytics questions like: "How many
users from Paris visited page X between 8 PM and 9 PM") and saves the data to
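
The consume-and-aggregate part described above looks roughly like this untested
sketch (the broker address, topic name, message format, field names and the
println stand-in for the real write are all made-up placeholders):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

val conf = new SparkConf().setAppName("HourlyAggregation")
val ssc = new StreamingContext(conf, Seconds(10))         // 10-second micro-batches

// Direct (receiver-less) Kafka stream; placeholder broker and topic
val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("events"))

// Assume each message value looks like "timestampMillis,userId,city,page"
val partials = stream.map(_._2).map { line =>
  val Array(ts, user, city, page) = line.split(",", 4)
  val hour = ts.toLong / (60 * 60 * 1000)                 // truncate the timestamp to its hour
  ((hour, city, page), Set(user))                         // keep user ids to count distinct users
}.reduceByKey(_ ++ _)                                     // per-batch partial results

partials.foreachRDD { rdd =>
  rdd.foreach { case ((hour, city, page), users) =>
    // upsert the partial result for (hour, city, page) into the external store here
    println(s"$hour $city $page ${users.size}")
  }
}

ssc.start()
ssc.awaitTermination()

Each 10-second batch only produces partial results for the hour it falls in, so
the store still has to merge them (union the user sets, or keep something like a
HyperLogLog per key) to answer the "distinct users between 8 PM and 9 PM" question
exactly.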