This is great, Gyula! A colleague here at Lyft has also done some work on bootstrapping DataStream programs, and we've talked a bit about doing this by running DataSet programs.
On Fri, Aug 17, 2018 at 3:28 AM, Gyula Fóra <gyula.f...@gmail.com> wrote:

> Hi All!
>
> I want to share with you a little project we have been working on at King
> (with some help from some dataArtisans folks). I think this would be a
> valuable addition to Flink and solve a bunch of outstanding production
> use-cases and headaches around state bootstrapping and state analytics.
>
> We have built a quick and dirty POC implementation on top of Flink 1.6,
> please check the README for some nice examples to get a quick idea:
>
> https://github.com/king/bravo
>
> *Short story*
> Bravo is a convenient state reader and writer library leveraging
> Flink’s batch processing capabilities. It supports processing and writing
> Flink streaming savepoints. At the moment it only supports processing
> RocksDB savepoints but this can be extended in the future for other state
> backends and checkpoint types.
>
> Our goal is to cover a few basic features:
>
> - Converting keyed states to Flink DataSets for processing and analytics
> - Reading/Writing non-keyed operator states
> - Bootstrap keyed states from Flink DataSets and create new valid
>   savepoints
> - Transform existing savepoints by replacing/changing some states
>
> Some example use-cases:
>
> - Point-in-time state analytics across all operators and keys
> - Bootstrap state of a streaming job from external resources such as
>   reading from database/filesystem
> - Validate and potentially repair corrupted state of a streaming job
> - Change max parallelism of a job
>
> Our main goal is to start working together with other Flink production
> users and make this something useful that can be part of Flink. So if you
> have use-cases please talk to us :)
> I have also started a google doc which contains a little bit more info than
> the readme and could be a starting place for discussions:
>
> https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
>
> I know there are a bunch of rough edges and bugs (and no tests) but our
> motto is: If you are not embarrassed, you released too late :)
>
> Please let me know what you think!
>
> Cheers,
> Gyula