reduceByKeyAndWindow with an inverse function has one unique
characteristic: it reuses a lot of the intermediate partitioned,
partially-reduced data. For every new batch, it "reduces" the new data and
"inverse reduces" the old out-of-window data. That inverse reduce needs
data from an old batch.
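A minimal runnable sketch of that form of the operator; the socket source, port, paths, and types here are placeholders, not anything from this thread:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}

object InverseReduceSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("inverse-reduce-sketch").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Minutes(15))  // batch interval
    ssc.checkpoint("/tmp/sketch-checkpoint")  // the inverse form requires checkpointing

    // Placeholder source: one key per line on a local socket.
    val pairs = ssc.socketTextStream("localhost", 9999).map(key => (key, 1L))

    val windowedCounts = pairs.reduceByKeyAndWindow(
      (a: Long, b: Long) => a + b,  // reduce: fold the newest batch into the window
      (a: Long, b: Long) => a - b,  // inverse reduce: subtract the batch that slid out
      Minutes(120),                 // window duration
      Minutes(15)                   // slide interval
    )
    windowedCounts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}

Because each slide only adds one new batch and subtracts one old batch, Spark must keep the old batches' partially-reduced data around until they leave the window, which is why checkpointing is mandatory for this variant.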
Hi,
I am using Spark Streaming's checkpointing mechanism and reading the data
from Kafka. The window duration for my application is 2 hours, with a
sliding interval of 15 minutes.
So, my batches run at the following intervals...
- 09:45
- 10:00
- 10:15
- 10:30
- and so on
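For reference, a minimal sketch of a setup like the one described, assuming the kafka-0-10 direct-stream API; the broker, topic, group id, and checkpoint path are placeholders. It uses StreamingContext.getOrCreate so a restart recovers the context (and the window state) from the checkpoint:

import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Minutes, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object CheckpointedWindowJob {
  val checkpointDir = "hdfs:///tmp/app-checkpoint"  // placeholder path

  def createContext(): StreamingContext = {
    val conf = new SparkConf().setAppName("windowed-kafka-job")
    val ssc = new StreamingContext(conf, Minutes(15))  // batch interval = slide = 15 min
    ssc.checkpoint(checkpointDir)

    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",  // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "windowed-job"              // placeholder group id
    )
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc, PreferConsistent, Subscribe[String, String](Seq("events"), kafkaParams))

    stream
      .map(record => (record.key, 1L))
      .reduceByKeyAndWindow(
        (a: Long, b: Long) => a + b,  // reduce new data into the 2-hour window
        (a: Long, b: Long) => a - b,  // inverse reduce the data that slid out
        Minutes(120), Minutes(15))
      .print()
    ssc
  }

  def main(args: Array[String]): Unit = {
    // On restart, rebuild the context and window state from the checkpoint.
    val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
    ssc.start()
    ssc.awaitTermination()
  }
}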
When my job is