Hi Giancarlo! Cool that you are working on a Kinesis connector, very exciting :-)
To have a look at the Kafka fault tolerance, you can check out this blog post, it explains it in one of the later sections: http://data-artisans.com/kafka-flink-a-practical-how-to/ A general overview of checkpointing is here: https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html The basic principles behind the Kafka checkpointing are the following: 1) Partition-to-Source assignment is deterministic. On a retry run, the parallel source task n will get the same partitions assigned. 2) Whenever a record is fetched from Kafka, the "offset" (incrementing position) is fetched as well and remembered locally. We always keep the position of the last fetched element in each partition. 3) That position is stored as part of the checkpointed, to define where in a partition the stream was at the time of the checkpoint. On a restore-after-failure, we set the source to start reading from that position. Greetings, Stephan On Fri, Sep 18, 2015 at 12:00 PM, Giancarlo Pagano <gianca...@beamly.com> wrote: > Hi Stephan, > > I’m not a lot familiar with Kafka on the other hand, but I think they > offer a very similar abstraction. Kinesis has a low-level api and an high > level consumer, the Kinesis Client Library (KCL). > I‘ve implemented a first version of the connector using the KCL, that I’ve > been using for testing. > It doesn’t support checkpointing yet, I’ll have a better look at the Flink > Kafka Consumer and see what needs to be done to add support for > checkpoints. I’ll probably need more help for that. > > Thanks, > Giancarlo > > > > On 17 Sep 2015, at 12:27, Stephan Ewen <se...@apache.org> wrote: > > > > Hi Giancarlo! > > > > I am not aware of any existing Kinesis connector. Would be definitely > something to put onto the roadmap for the near future. This is a stream > source we should support similarly to Kafka. > > > > I am not super familiar with Kinesis, but it looks a bit like offering a > similar abstraction as Kafka, especially with the ability to read the > streams from specific positions. That way, it should be possible to follow > the same design as the Kafka connector (even simpler, if they don't have > the tricky offset committing part of Kafka). > > > > Greetings, > > Stephan > > > > > > On Thu, Sep 17, 2015 at 12:54 PM, Giancarlo Pagano <gianca...@beamly.com> > wrote: > > Hi, > > > > Is there any project already working on a Kinesis connector for Flink or > any plan to add a Kinesis connector to the main Flink distribution in the > future? > > > > Thanks, > > Giancarlo > > > >