I would like to create a custom aggregator function for a windowed KeyedStream 
which I have complete control over. That is, instead of implementing an 
AggregateFunction, I would like to control the lifecycle of the Flink state by 
implementing the CheckpointedFunction interface, though I still want this state 
to be per-key, per-window. 

I am not sure which function I should be calling on the WindowedStream in order 
to invoke this custom functionality. I see from the documentation that 
CheckpointedFunction is for non-keyed state - which I guess eliminates this 
option.

A little background - I have logic that needs to hold a very large state in the 
operator - lots of counts by sub-key. Since only a subset of these 
aggregations is updated between checkpoints, I was interested in trying out 
incremental checkpointing in RocksDB. However, I don't want to be hitting 
RocksDB I/O on every update of state since we need very low latency. Instead, 
I wanted to hold the state in Java heap and then update the Flink state on 
checkpoint, i.e. something like CheckpointedFunction.
My assumption is that any update I make to RocksDB-backed state will hit the 
local disk - if this is wrong then I'll be happy.
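
To make the idea concrete, here is a rough sketch of the buffering pattern I 
have in mind, in plain Java with no Flink types. The class BufferedCounts and 
its methods are made up for illustration, and the backing Map is only a 
stand-in for the RocksDB-backed Flink state. In a real job, flush() would be 
called from something like snapshotState().

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Flink class: counts are updated on the heap,
// and only the sub-keys that changed are copied to the backing store on flush.
class BufferedCounts {
    private final Map<String, Long> heapCounts = new HashMap<>();
    private final Set<String> dirty = new HashSet<>();
    // Stand-in for the RocksDB-backed state (an assumption for this sketch).
    private final Map<String, Long> backingStore;

    BufferedCounts(Map<String, Long> backingStore) {
        this.backingStore = backingStore;
        // On restore, load the previously flushed counts back onto the heap.
        heapCounts.putAll(backingStore);
    }

    // Hot path: touches only heap structures, never the backing store.
    void increment(String subKey) {
        heapCounts.merge(subKey, 1L, Long::sum);
        dirty.add(subKey);
    }

    long get(String subKey) {
        return heapCounts.getOrDefault(subKey, 0L);
    }

    // Checkpoint path: write only the changed sub-keys, then reset tracking.
    // Returns the number of entries written.
    int flush() {
        int written = dirty.size();
        for (String subKey : dirty) {
            backingStore.put(subKey, heapCounts.get(subKey));
        }
        dirty.clear();
        return written;
    }
}
```

The point is that increment() never touches the backing store, so the per-event 
cost is a heap update, while flush() writes only the sub-keys that changed 
since the last checkpoint.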

What other options do I have?

Thanks,
Hayden Marchant
