The window can be larger; it's the batch/slide interval that has to be smaller
(I'm assuming every 5-10 secs?).
Most of the default window functions take the slide interval as a separate
parameter, and you can override it as long as it's a multiple of the streaming
context's batch interval.
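
To make the "multiple of the batch interval" rule concrete, here is a minimal sketch in plain Python. The function name and the example numbers are made up for illustration; the only thing it encodes is that Spark Streaming expects both the window length and the slide interval to be multiples of the batch interval.

```python
def validate_window(batch_secs, window_secs, slide_secs):
    """Return a list of problems; an empty list means the settings line up.

    Hypothetical helper, not part of Spark: it only checks that the window
    and slide durations are whole multiples of the batch interval.
    """
    problems = []
    if window_secs % batch_secs != 0:
        problems.append("window must be a multiple of the batch interval")
    if slide_secs % batch_secs != 0:
        problems.append("slide must be a multiple of the batch interval")
    return problems


# 10 s batches with a 5 min window sliding every 5 min: consistent.
ok = validate_window(batch_secs=10, window_secs=300, slide_secs=300)

# A 25 s slide does not divide into 10 s batches, so one problem is reported.
bad = validate_window(batch_secs=10, window_secs=300, slide_secs=25)
```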
On 16 Sep 2015, at 23:30, Ted Yu wrote:
bq. and check if 5 minutes have passed
What if the duration for the window is longer than 5 minutes?
Cheers
On Wed, Sep 16, 2015 at 1:25 PM, Adrian Tanase wrote:
If you don't need the counts in between the DB writes, you could simply use a 5
min window for the updateStateByKey and use foreachRDD on the resulting DStream.
Even simpler, you could use reduceByKeyAndWindow directly.
Lastly, you could keep a variable on the driver and check if 5 minutes have
passed.
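
A rough sketch of the reduceByKeyAndWindow option, written against the PySpark DStream API. It assumes `events` is a DStream of `(event_type, 1)` pairs and that `write_to_db` is your own function; both names are hypothetical, and so is the 300 second default. Using the same value for window and slide gives one non-overlapping count per 5 minutes.

```python
def write_counts_every_window(events, write_to_db, window_secs=300):
    """Count events per type over a tumbling window and hand each result to write_to_db.

    events      -- assumed to be a DStream of (event_type, 1) pairs
    write_to_db -- hypothetical callback that receives a list of (type, count)
    window_secs -- must be a multiple of the streaming context batch interval
    """
    # Passing None as the inverse function makes Spark recompute each window
    # from scratch instead of updating it incrementally.
    counts = events.reduceByKeyAndWindow(
        lambda a, b: a + b,  # add up the per-event 1s for each type
        None,
        window_secs,         # window length
        window_secs,         # slide interval equal to the window: no overlap
    )

    def flush(rdd):
        # collect() pulls the result to the driver; that is only reasonable
        # here because a map of type to count is assumed to be small.
        write_to_db(rdd.collect())

    counts.foreachRDD(flush)
    return counts
```

Something like `write_counts_every_window(pairs, save_counts)` before `ssc.start()` would be the intended use, with `pairs` and `save_counts` standing in for your own stream and writer.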
Hello.
I have a streaming job that is processing data. I process a stream of
events, taking actions when I see anomalous events. I also keep a count of
events observed, using updateStateByKey to maintain a map of type to count.
I would like to periodically (every 5 minutes) write the results of my
c