[ https://issues.apache.org/jira/browse/FLINK-6178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15944776#comment-15944776 ]
Xiaogang Shi commented on FLINK-6178:
-------------------------------------

[~tzulitai] Thanks a lot for your quick response. The changes to the interfaces in {{RuntimeContext}} sound great! They do help in the conversion of savepoints. Looking forward to them.

> Allow upgrades to state serializers
> -----------------------------------
>
>                 Key: FLINK-6178
>                 URL: https://issues.apache.org/jira/browse/FLINK-6178
>             Project: Flink
>          Issue Type: New Feature
>          Components: State Backends, Checkpointing, Type Serialization System
>            Reporter: Tzu-Li (Gordon) Tai
>            Assignee: Tzu-Li (Gordon) Tai
>
> Currently, users are locked in with the serializer implementation used to
> write their state.
> This is suboptimal, since users may well want to change their serialization
> formats, state schemas, or types in the future.
> This is an umbrella JIRA for the required tasks to make this possible.
> Here's an overview description of what to expect for the overall outcome of
> this JIRA (the specific details are outlined in their respective subtasks):
> Ideally, the main user-facing change this would result in is that users
> implementing their custom {{TypeSerializer}}s will also need to implement
> hook methods that identify whether or not there is a change to the serialized
> format or even a change to the serialized data type. It would be the user's
> responsibility to ensure that the {{deserialize}} method can bridge the
> change between the old and new formats.
> For Flink's built-in serializers that are automatically built using the
> user's configuration (most notably the more complex {{KryoSerializer}} and
> {{GenericArraySerializer}}), Flink should be able to automatically
> "reconfigure" them using the new configuration, so that the reconfigured
> versions can be used to de- / serialize previous state. This would require
> knowledge of the previous configuration of the serializer, therefore
> "serializer configuration metadata" will be added to savepoints.
> Note that for the first version of this, although additional infrastructure
> (e.g. serializer reconfigure hooks, serializer configuration metadata in
> savepoints) will be added to potentially allow Kryo version upgrades, this
> JIRA will not cover them. Kryo's binary format breaks across major
> versions, so supporting that will most likely need some further changes.
> Therefore, for the {{KryoSerializer}}, "upgrading" it simply means changes
> in the registration of specific / default serializers, at least for now.
> Finally, we would need to add a "convertState" phase to the task lifecycle
> that takes place after the "open" phase and before checkpointing starts / the
> task starts running. It can only happen after "open", because only then can
> we be certain whether any reconfiguration of state serialization has
> occurred and state needs to be converted. Ideally, the code for
> "convertState" is designed so that it can be easily exposed as an offline
> tool in the future.
> For this JIRA, we should simply assume that after {{open()}}, we have all the
> required information and serializers are appropriately reconfigured.
> [~srichter] is currently planning to deprecate RuntimeContext state
> registration methods in favor of a new interface that enforces eager state
> registration, so that we may have all the info after {{open()}}.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
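
To make the "hook methods" and "serializer configuration metadata" ideas from the issue description more concrete, here is a minimal, self-contained sketch. Every name in it ({{SerializerUpgradeSketch}}, {{Compatibility}}, {{snapshotConfiguration}}, {{checkCompatibility}}, the version numbers, the toy {{User}} type) is hypothetical and chosen only for illustration; none of it is Flink's actual API, and the real design is left to the subtasks. It assumes the stored metadata is just a format version number.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: none of these types or method names are Flink APIs.
public class SerializerUpgradeSketch {

    /** Outcome of comparing a serializer with the configuration stored alongside old state. */
    enum Compatibility { COMPATIBLE, REQUIRES_CONVERSION }

    /** Toy state type; version 1 of the format stored only the name. */
    static final class User {
        final String name;
        final int age; // -1 when the old format did not record it
        User(String name, int age) { this.name = name; this.age = age; }
    }

    static final class UserSerializer {
        static final int CURRENT_VERSION = 2;

        /** Configuration metadata that would be written into the savepoint. */
        int snapshotConfiguration() { return CURRENT_VERSION; }

        /** Hook: does state written under previousVersion use a different format? */
        Compatibility checkCompatibility(int previousVersion) {
            return previousVersion == CURRENT_VERSION
                    ? Compatibility.COMPATIBLE
                    : Compatibility.REQUIRES_CONVERSION;
        }

        /** Writes the current (version 2) format: name length, name bytes, age. */
        byte[] serialize(User user) {
            byte[] name = user.name.getBytes(StandardCharsets.UTF_8);
            return ByteBuffer.allocate(4 + name.length + 4)
                    .putInt(name.length).put(name).putInt(user.age).array();
        }

        /** Bridges both formats: reads the age only if the writer recorded it. */
        User deserialize(byte[] bytes, int writtenVersion) {
            ByteBuffer in = ByteBuffer.wrap(bytes);
            byte[] name = new byte[in.getInt()];
            in.get(name);
            int age = writtenVersion >= 2 ? in.getInt() : -1;
            return new User(new String(name, StandardCharsets.UTF_8), age);
        }
    }

    /** Stand-in for state bytes written by the old (version 1) serializer. */
    static byte[] writeV1(String userName) {
        byte[] name = userName.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length).putInt(name.length).put(name).array();
    }

    public static void main(String[] args) {
        UserSerializer serializer = new UserSerializer();
        int storedVersion = 1; // pretend this was read from savepoint metadata
        byte[] oldState = writeV1("ada");

        // A "convertState"-style step: rewrite old state only if the hook asks for it.
        if (serializer.checkCompatibility(storedVersion) == Compatibility.REQUIRES_CONVERSION) {
            User restored = serializer.deserialize(oldState, storedVersion);
            byte[] converted = serializer.serialize(restored);
            User reread = serializer.deserialize(converted, serializer.snapshotConfiguration());
            System.out.println(reread.name + " " + reread.age);
        }
    }
}
```

The conversion in {{main}} only runs once the stored version is known, which mirrors the reason the description places "convertState" after "open".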