+1. Being able to analyze the state is a huge operational advantage. Thanks, Gyula, for the POC; I would be very interested in contributing to the work.
--
Rong

On Tue, Aug 21, 2018 at 4:26 AM Till Rohrmann <trohrm...@apache.org> wrote:

> Big +1 for this feature. A tool to get your state out of and into Flink
> will be tremendously helpful.
>
> On Mon, Aug 20, 2018 at 10:21 AM Aljoscha Krettek <aljos...@apache.org> wrote:
>
> > +1 I'd like to have something like this in Flink a lot!
> >
> > > On 19. Aug 2018, at 11:57, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > >
> > > Hi all!
> > >
> > > Thanks for the feedback, and I'm happy there is some interest :)
> > > Tomorrow I will start improving the proposal based on the feedback and
> > > will get back to work.
> > >
> > > If you are interested in working together on this, please ping me and
> > > we can discuss some ideas/plans and how to share the work.
> > >
> > > Cheers,
> > > Gyula
> > >
> > > Paris Carbone <par...@kth.se> wrote on 18 Aug 2018 at 9:03:
> > >
> > >> +1
> > >>
> > >> It might also be a good start for implementing queryable stream state
> > >> with snapshot isolation using that mechanism.
> > >>
> > >> Paris
> > >>
> > >>> On 17 Aug 2018, at 12:28, Gyula Fóra <gyula.f...@gmail.com> wrote:
> > >>>
> > >>> Hi All!
> > >>>
> > >>> I want to share with you a little project we have been working on at
> > >>> King (with some help from some data Artisans folks). I think this would
> > >>> be a valuable addition to Flink and would solve a bunch of outstanding
> > >>> production use-cases and headaches around state bootstrapping and state
> > >>> analytics.
> > >>>
> > >>> We have built a quick and dirty POC implementation on top of Flink 1.6;
> > >>> please check the README for some nice examples to get a quick idea:
> > >>>
> > >>> https://github.com/king/bravo
> > >>>
> > >>> *Short story*
> > >>> Bravo is a convenient state reader and writer library leveraging
> > >>> Flink's batch processing capabilities. It supports processing and
> > >>> writing Flink streaming savepoints. At the moment it only supports
> > >>> processing RocksDB savepoints, but this can be extended in the future
> > >>> to other state backends and checkpoint types.
> > >>>
> > >>> Our goal is to cover a few basic features:
> > >>>
> > >>> - Converting keyed states to Flink DataSets for processing and
> > >>>   analytics
> > >>> - Reading/writing non-keyed operator states
> > >>> - Bootstrapping keyed states from Flink DataSets and creating new
> > >>>   valid savepoints
> > >>> - Transforming existing savepoints by replacing/changing some states
> > >>>
> > >>> Some example use-cases:
> > >>>
> > >>> - Point-in-time state analytics across all operators and keys
> > >>> - Bootstrapping the state of a streaming job from external resources,
> > >>>   such as a database or filesystem
> > >>> - Validating and potentially repairing the corrupted state of a
> > >>>   streaming job
> > >>> - Changing the max parallelism of a job
> > >>>
> > >>> Our main goal is to start working together with other Flink production
> > >>> users and make this something useful that can be part of Flink. So if
> > >>> you have use-cases, please talk to us :)
> > >>> I have also started a Google doc which contains a little more info than
> > >>> the README and could be a starting place for discussions:
> > >>>
> > >>> https://docs.google.com/document/d/103k6wPX20kMu5H3SOOXSg5PZIaYpwdhqBMr-ppkFL5E/edit?usp=sharing
> > >>>
> > >>> I know there are a bunch of rough edges and bugs (and no tests), but
> > >>> our motto is: if you are not embarrassed, you released too late :)
> > >>>
> > >>> Please let me know what you think!
> > >>>
> > >>> Cheers,
> > >>> Gyula
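To give a concrete flavour of the proposal, here is a minimal sketch of what reading a keyed state out of a savepoint into a DataSet for point-in-time analytics might look like. The names used for the savepoint-reading part (Savepoint, StateMetadataUtils.loadSavepoint, OperatorStateReader, KeyedStateReader.forValueStateKVPairs) are assumptions modelled on the README's description rather than a confirmed API, so please check the repository for the actual examples; the rest is the standard Flink DataSet API.

    // NOTE: hypothetical sketch only. The Bravo-side names (Savepoint,
    // StateMetadataUtils, OperatorStateReader, KeyedStateReader) are assumed
    // from the README description and may differ from the real API, so their
    // imports are omitted here on purpose.
    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.tuple.Tuple2;

    public class SavepointAnalyticsSketch {

        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

            // Load the savepoint metadata and point a reader at one operator (by uid).
            Savepoint savepoint =
                    StateMetadataUtils.loadSavepoint("hdfs:///savepoints/savepoint-1234");
            OperatorStateReader reader =
                    new OperatorStateReader(env, savepoint, "my-counting-operator");

            // Read a keyed ValueState<Integer> named "Count" as (key, value) pairs.
            DataSet<Tuple2<Integer, Integer>> counts = reader.readKeyedStates(
                    KeyedStateReader.forValueStateKVPairs(
                            "Count", new TypeHint<Tuple2<Integer, Integer>>() {}));

            // From here on it is ordinary DataSet analytics,
            // e.g. find the keys whose counters exceed some threshold.
            counts.filter(kv -> kv.f1 > 1000).print();
        }
    }

The same reader output could feed any other batch transformation, and the bootstrapping direction described above would presumably go the opposite way: build a DataSet from an external source and write it back as a new, valid savepoint.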