Hi,
thanks for opening the Jira issue. I'll continue the discussion here
instead of in the Jira, I hope that's OK.

That last paragraph of yours is the most interesting. We will have to adapt
the way that checkpoints are stored to accommodate state backends that
store state in some external system, such as Cassandra. Right now, each
Checkpoint/Savepoint is stored in isolation and the system does not know
about any relation between them. We have to introduce such a relation,
basically putting the checkpoints into a graph structure that shows the
lineage of the checkpoints. Then, when we are cleaning up old checkpoints
we check the ranges of (logical) timestamps of the checkpoints that we can
remove and instruct the StateBackend to remove the relevant ranges.

This leads to another interesting thing. We might need to have a
StateBackend component running in the JobManager that we can invoke to
delete ranges of checkpoints. Right now, a StateBackend only lives on the
TaskManager, in the operators. Cleanup of time ranges, however, should
probably happen in some centralized location.

Cheers,
Aljoscha

On Mon, 25 Jul 2016 at 22:38 Chen Qin <qinnc...@gmail.com> wrote:

> Hi Aljoscha,
>
> Cool! I created a JIRA for this.
> https://issues.apache.org/jira/browse/FLINK-4266
> Some comments inline.
>
> Chen
>
> On Mon, Jul 25, 2016 at 2:41 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
> > Hi,
> > I thought there was a Jira for that but I looked and couldn't find it. If
> > you'd like you can create one and we can discuss the design. Do you have
> > any ideas yet?
> >
> > The tricky things I see in this are:
> >  - Knowing which data is the current data. This will require some kind of
> > timestamps or increasing IDs.
> >
>
> ​We are thinking of leveraging client assigned timestamp from
> checkpoint_timestamp.
> ​
>
> >  - Knowing when you can retire data from Cassandra
> >
> ​That's interesting part, each state checkpoint snapshot might reference
> t's previous snapshot​. Delete/Consolidate rows previous snapshot with
> eventual consistency can be tricky.
>  ​
>
> > Some of these might require some changes to how Flink handles checkpoints
> > and it somewhat goes into the direction of incremental checkpoints. That
> > last part is especially important once you deal with savepoints, which
> can
> > stay around indefinitely.
> >
> > Cheers,
> > Aljoscha
> >
> > On Mon, 25 Jul 2016 at 08:31 Tai Gordon <tzuli...@gmail.com> wrote:
> >
> > > Hi Chen,
> > >
> > > AFAIK, there currently isn’t any FLIP / JIRA / work currently for a
> > > Cassandra state backend. I think it’ll definitely by interesting to
> have
> > > one in Flink.
> > >
> > > Regards,
> > > Gordon
> > >
> > >
> > > On July 25, 2016 at 10:24:32 AM, Chen Qin (qinnc...@gmail.com) wrote:
> > >
> > > ​Hi there,
> > >
> > > Is there any design docs or on going efforts there?
> > >
> > > Thanks,
> > > Chen ​
> > >
> >
>

Reply via email to