Hi! Regarding support for savepoint reading / writing / processing directly in core Flink, we've been thinking about that lately and might push a bit to add the functionality to Flink in the next release. For example, besides Bravo, Seth (CC'ed) had also implemented something [1] for this. We should start thinking about converging the efforts of these similar tools and supporting this in Flink soon. There's no official JIRA / feature proposal for this yet, but if you're interested, please keep an eye on the dev mailing list.
Cheers,
Gordon

[1] https://github.com/sjwiesman/flink/tree/savepoint-connector

On Thu, Mar 28, 2019 at 4:26 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> Hi!
>
> I don't think there is any ongoing effort in core Flink other than this
> library we created.
>
> You are probably right that it is pretty hacky at the moment. I would say
> this is one way we could do it; it seemed convenient to me at the time I
> wrote the code.
>
> If you have ideas on how to structure it better or improve it, you know
> where to find the code, so feel free to open a PR :) That might actually
> take us closer to having this properly in Flink one day soon.
>
> Just to clarify the code you are showing:
> writer.writeAll() -> runs the batch job that writes the checkpoint files
> for the changed operator states and returns a reference to the
> OperatorState metadata object
> StateMetadataUtils.createNewSavepoint() -> replaces the metadata for the
> operator states you have just written in the previous savepoint
> StateMetadataUtils.writeSavepointMetadata() -> writes the new metadata file
>
> So metadata writing happens as the very last step, after the batch job
> has run. This is similar to how it works in streaming jobs, in the sense
> that there the JobManager writes the metadata file after checkpointing is
> done. The downside of this approach is that the client might not have
> access to write the metadata file here.
>
> Gyula
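To make the three-step flow Gyula describes concrete, here is a minimal self-contained Java sketch. The class and method names mirror the ones mentioned in the thread (`writeAll`, `createNewSavepoint`, `writeSavepointMetadata`), but everything below is a hypothetical stub that only models the ordering of the steps; the real implementations live in the Bravo library and take Flink types, not strings.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stubs modeling the savepoint-rewrite flow described above.
// Each method records which step ran so the ordering is visible; the real
// Bravo API operates on Flink savepoint metadata objects instead.
class SavepointFlowSketch {
    static final List<String> steps = new ArrayList<>();

    // Stand-in for writer.writeAll(): runs the batch job that writes the
    // checkpoint files for the changed operator states and returns a
    // reference to the new OperatorState metadata.
    static String writeAll() {
        steps.add("write-operator-states");
        return "new-operator-state-metadata";
    }

    // Stand-in for StateMetadataUtils.createNewSavepoint(): replaces the
    // metadata of the rewritten operator states in the previous savepoint.
    static String createNewSavepoint(String previousSavepoint, String newOperatorState) {
        steps.add("replace-operator-metadata");
        return previousSavepoint + "+" + newOperatorState;
    }

    // Stand-in for StateMetadataUtils.writeSavepointMetadata(): writes the
    // new metadata file, as the very last step after the batch job has run.
    static void writeSavepointMetadata(String newSavepoint) {
        steps.add("write-metadata-file");
    }

    public static void main(String[] args) {
        String newState = writeAll();
        String newSavepoint = createNewSavepoint("previous-savepoint", newState);
        writeSavepointMetadata(newSavepoint);
        System.out.println(String.join(" -> ", steps));
    }
}
```

Running this prints `write-operator-states -> replace-operator-metadata -> write-metadata-file`, which is the point of the explanation above: the metadata file is written last, by the client, after the batch job has produced the new state files.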