Hi Aljoscha,

Cool! I created a JIRA for this.
https://issues.apache.org/jira/browse/FLINK-4266

Some comments inline.

Chen

On Mon, Jul 25, 2016 at 2:41 AM, Aljoscha Krettek <[email protected]> wrote:

> Hi,
> I thought there was a Jira for that but I looked and couldn't find it. If
> you'd like you can create one and we can discuss the design. Do you have
> any ideas yet?
>
> The tricky things I see in this are:
> - Knowing which data is the current data. This will require some kind of
> timestamps or increasing IDs.
>

We are thinking of leveraging the client-assigned timestamp from
checkpoint_timestamp.

> - Knowing when you can retire data from Cassandra
>

That's the interesting part: each state checkpoint snapshot might reference
its previous snapshot. Deleting or consolidating rows of a previous snapshot
under eventual consistency can be tricky.

> Some of these might require some changes to how Flink handles checkpoints
> and it somewhat goes into the direction of incremental checkpoints. That
> last part is especially important once you deal with savepoints, which can
> stay around indefinitely.
>
> Cheers,
> Aljoscha
>
> On Mon, 25 Jul 2016 at 08:31 Tai Gordon <[email protected]> wrote:
>
> > Hi Chen,
> >
> > AFAIK, there currently isn’t any FLIP / JIRA / work for a
> > Cassandra state backend. I think it’ll definitely be interesting to have
> > one in Flink.
> >
> > Regards,
> > Gordon
> >
> >
> > On July 25, 2016 at 10:24:32 AM, Chen Qin ([email protected]) wrote:
> >
> > Hi there,
> >
> > Are there any design docs or ongoing efforts there?
> >
> > Thanks,
> > Chen
