Thanks - appreciate the response.

The reason I want to control these things is this - my state grows and shrinks 
over time (user level windowing as application state). I would like to trigger 
checkpoints just after the state has been crunched/compressed (at the window 
boundary). Say I crunch every 10 seconds and slide the window by 8 seconds (2 
second overlap). My window buffers would only need to checkpoint 2s worth of 
in-flight data (apart from compressed state for the other 8 seconds). With 
flink this seems hard given the windowing is at a partition level and not a 
global window. Even if I use event time, every partition will be at different 
points (of when that partition becomes ready to crunch). OTOH, if I were to 
introduce barriers at source, I could ensure that I get a good point globally 
to crunch and checkpoint my state.

Does the checkpoint co-ordinator provide triggers to application to “crunch 
now" and reduce state? BTW, this may not be optimal, because applications would 
have natural triggers to crunch windows (and can’t just react to these triggers 
at random points).

Is there any benefit of allowing flink to do the windowing (in terms of getting 
smaller checkpoints)? I read that flink does not checkpoint in-flight data, but 
this would be impossible with event time and out of order processing by 
operators (I can see how it would work with processing time, and in order 
crunching). When the barrier hits the operator, flink will have to checkpoint 
all active event time windows. Given these event time windows have different 
trigger points, it might help to checkpoint right after trigger evaluation so 
the state is compressed (and there is less things to checkpoint). 

There seems to be some relationship between watermarks, triggers and checkpoint 
that is someone not being leveraged.

-Abhishek-

> On May 19, 2016, at 5:48 AM, Till Rohrmann <trohrm...@apache.org> wrote:
> 
> Hi Abhishek,
> 
> you can implement custom sources by implementing the SourceFunction or the 
> ParallelSourceFunction interface and then calling 
> StreamExecutionEnvironment.addSource.
> 
> At the moment, it is not possible to control manually or from a source 
> function when to trigger a checkpoint. This is the responsibility of the 
> CheckpointCoordinator.
> 
> Cheers,
> Till
> 
> 
> On Tue, May 17, 2016 at 8:28 PM, Abhishek R. Singh 
> <abhis...@tetrationanalytics.com <mailto:abhis...@tetrationanalytics.com>> 
> wrote:
> Hi,
> 
> Can we define custom sources in link? Control the barriers and (thus) 
> checkpoints at good watermark points?
> 
> -Abhishek-
> 

Reply via email to