Hi Seth,

that sounds reasonable. What I was asking for was not to reverse engineer the binary format, but to make the savepoint write API a little more reusable, so that it could be wrapped by some other technology. I don't know the details well enough to propose a solution, but it seems to me that it could be possible to use something like a Writer instead of a Transform. Or maybe the Transform can use the Writer internally; the goal is just to make it possible to create the savepoint from "outside" of Flink (with some library, of course).
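To make the idea concrete, here is a minimal sketch of what such a generic writer could look like. Every name here (SavepointWriter, InMemorySavepointWriter) is hypothetical and purely illustrative - none of this is part of Flink or of the FLIP. The point is only that a plain key/value writer interface could be driven by any program, not just a Flink job:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface (not a real Flink API): a minimal surface that any
// technology - a Spark job, a Beam pipeline, a plain JVM program - could call
// to produce keyed state for one operator, without touching Flink's Transforms.
interface SavepointWriter<K, V> extends AutoCloseable {
    void add(K key, V value);   // stage one keyed-state entry
    @Override
    void close();               // flush and finalize the savepoint
}

// Toy in-memory implementation, just to show the shape; a real implementation
// would delegate to Flink's internal savepoint-writing machinery.
class InMemorySavepointWriter<K, V> implements SavepointWriter<K, V> {
    final Map<K, V> state = new LinkedHashMap<>();
    private boolean closed = false;

    @Override
    public void add(K key, V value) {
        if (closed) {
            throw new IllegalStateException("writer already closed");
        }
        state.put(key, value);
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

Nothing above needs to run inside a Flink cluster; any external runner could drive such an interface, which is the point of the request.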

Jan

On 5/31/19 1:17 PM, Seth Wiesman wrote:
@Konstantin agreed, that was a large impetus for this feature. Also, I am 
happy to change the name to something that better describes the feature set.

@Jan

Savepoints depend heavily on a number of Flink-internal components that may 
change between versions: state backend internals, type serializers, the 
specific hash function used to turn a UID into an OperatorID, etc. I consider 
it a feature of this proposal that the library depends on those internal 
components instead of reverse engineering the binary format. This way, as those 
internals change or new state features are added (think of the recent addition 
of TTL), we will get support automatically. I do not believe anything else is 
maintainable.

Seth

On May 31, 2019, at 5:56 AM, Jan Lukavský <je...@seznam.cz> wrote:

Hi,

this is an awesome and really useful feature. If I might ask for one thing to 
consider - would it be possible to make the Savepoint manipulation API (at 
least writing the Savepoint) less dependent on other parts of Flink internals 
(e.g. KeyedStateBootstrapFunction) and provide something more general (e.g. 
some generic Writer)? Why I'm asking: I can totally imagine a situation where 
users might want to create bootstrapped state with some other runner (e.g. 
Apache Spark), and then run Apache Flink after the state has been created. 
This makes even more sense in the context of Apache Beam, which already 
provides all the necessary pieces to make this happen. The question is - would 
it be possible to design this feature so that writing the savepoint from a 
different runner would be possible?

Cheers,

  Jan

On 5/30/19 1:14 AM, Seth Wiesman wrote:
Hey Everyone!

Gordon and I have been discussing adding a savepoint connector to Flink for 
reading, writing and modifying savepoints.

This is useful for:

  - Analyzing state for interesting patterns
  - Troubleshooting or auditing jobs by checking for discrepancies in state
  - Bootstrapping state for new applications
  - Modifying savepoints, such as:
      - Changing max parallelism
      - Making breaking schema changes
      - Correcting invalid state

We are looking forward to your feedback!

This is the FLIP:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-43%3A+Savepoint+Connector

Seth


