Re: [DISCUSS] FLIP-201: Persist local state in working directory

Till Rohrmann Thu, 30 Dec 2021 07:29:15 -0800

Hi David,

Thanks for your feedback.

With the graceful shutdown I mean a way to stop the TaskManager and to
clean up the working directory. At the moment, I think we always kill the
process via SIGTERM or SIGKILL. This won't clean up the working directory
because it could also originate from a process failure. I think what Niklas
does is to introduce a signal handler to react to SIGTERM to disconnect
from the JobMaster.

You are right that by default Flink will now set the RocksDB directory to
the working temp directory. Before it defaulted to the spilling
directories. I think this is not a problem because users can still manually
configure multiple RocksDB directories via state.backend.rocksdb.localdir.
Moreover, I am not sure how well this mechanism works in practice. Flink
will simply iterate through the directories when creating new RocksDB state
backends w/o a lot of smartness. If one of the directories is full, then
Flink won't use another one but simply fail.

I do see the point of a proper serialization format and I agree that we
should eventually implement it. My reasoning was that the PR is already
quite big and I would prefer getting it in and then tackling this problem
as a follow-up instead of increasing the scope of the changes further
because the serialization format is not required for this feature (strictly
speaking). I hope that this makes sense.

I will also respond to your PR comments.

Cheers,
Till

On Thu, Dec 30, 2021 at 4:00 PM David Morávek <[email protected]> wrote:

> Hi Till,
>
> thanks for drafting the FLIP, it looks really good. I did a quick pass over
> the PR and it seems to be heading in a right direction.
>
> It might be required to introduce a graceful shutdown of the TaskExecutor
> > in order to support proper cleanup of resources.
> >
>
> This is actively being worked on by Niklas in FLINK-25277 [1].
>
> In the PR, I've seen that you're also replacing directories for storing the
> local state with the working directory. Should this be a concern? Is for
> example rocksdb able to leverage multiple mount paths for spreading the
> load?
>
> I'd also be in favor of introducing a proper (evolving) serialization
> format right away instead of the Java serialization, but no hard feelings
> if we don't.
>
> [1] https://issues.apache.org/jira/browse/FLINK-25277
>
> Best,
> D.
>
> On Wed, Dec 29, 2021 at 4:58 PM Till Rohrmann <[email protected]>
> wrote:
>
> > I've created draft PR for the desired changes [1]. It might be easier to
> > take a look at than the branch.
> >
> > [1] https://github.com/apache/flink/pull/18237
> >
> > Cheers,
> > Till
> >
> > On Tue, Dec 28, 2021 at 3:22 PM Till Rohrmann <[email protected]>
> > wrote:
> >
> > > Hi everyone,
> > >
> > > I would like to start a discussion about using the working directory to
> > > persist local state for faster recovery (FLIP-201) [1]. Persisting the
> > > local state will be beneficial if a crashed process is restarted with
> the
> > > same working directory. In this case, Flink does not have to download
> the
> > > state artifacts again and can recover locally.
> > >
> > > A POC can be found here [2].
> > >
> > > Looking forward to your feedback.
> > >
> > > [1] https://cwiki.apache.org/confluence/x/wJuqCw
> > > [2] https://github.com/tillrohrmann/flink/tree/FLIP-201
> > >
> > > Cheers,
> > > Till
> > >
> >
>

Re: [DISCUSS] FLIP-201: Persist local state in working directory

Reply via email to