Thanks - appreciate the response. The reason I want to control these things is this - my state grows and shrinks over time (user level windowing as application state). I would like to trigger checkpoints just after the state has been crunched/compressed (at the window boundary). Say I crunch every 10 seconds and slide the window by 8 seconds (2 second overlap). My window buffers would only need to checkpoint 2s worth of in-flight data (apart from compressed state for the other 8 seconds). With flink this seems hard given the windowing is at a partition level and not a global window. Even if I use event time, every partition will be at different points (of when that partition becomes ready to crunch). OTOH, if I were to introduce barriers at source, I could ensure that I get a good point globally to crunch and checkpoint my state.
Does the checkpoint co-ordinator provide triggers to application to “crunch now" and reduce state? BTW, this may not be optimal, because applications would have natural triggers to crunch windows (and can’t just react to these triggers at random points). Is there any benefit of allowing flink to do the windowing (in terms of getting smaller checkpoints)? I read that flink does not checkpoint in-flight data, but this would be impossible with event time and out of order processing by operators (I can see how it would work with processing time, and in order crunching). When the barrier hits the operator, flink will have to checkpoint all active event time windows. Given these event time windows have different trigger points, it might help to checkpoint right after trigger evaluation so the state is compressed (and there is less things to checkpoint). There seems to be some relationship between watermarks, triggers and checkpoint that is someone not being leveraged. -Abhishek- > On May 19, 2016, at 5:48 AM, Till Rohrmann <trohrm...@apache.org> wrote: > > Hi Abhishek, > > you can implement custom sources by implementing the SourceFunction or the > ParallelSourceFunction interface and then calling > StreamExecutionEnvironment.addSource. > > At the moment, it is not possible to control manually or from a source > function when to trigger a checkpoint. This is the responsibility of the > CheckpointCoordinator. > > Cheers, > Till > > > On Tue, May 17, 2016 at 8:28 PM, Abhishek R. Singh > <abhis...@tetrationanalytics.com <mailto:abhis...@tetrationanalytics.com>> > wrote: > Hi, > > Can we define custom sources in link? Control the barriers and (thus) > checkpoints at good watermark points? > > -Abhishek- >