GitHub user tzulitai opened a pull request: https://github.com/apache/flink/pull/3834
[FLINK-6425] [runtime] Activate serializer upgrades in state backends This is a follow-up PR that finalizes serializer upgrades, and is based on #3804 (therefore, only the 2nd and 3rd commits, ed82173 and e77096a is relevant). This PR includes the following changes: 1. Write configuration snapshots of serializers along with checkpoints (this changes serialization format of checkpoints). 2. On restore, confront configuration snapshots with newly registered serializers using the new `TypeSerializer#getMigrationStrategy(TypeSerializerConfigSnapshot)` method. 3. Serializer upgrades is completed if the confrontation determines that no migration is needed. The confrontation reconfigures the new serializer if the case requires. If the serializer cannot be reconfigured to avoid state migration, the job simply fails (as we currently do not have the actual state migration feature). Note that the confrontation of config snapshots is currently only performed in the `RocksDBKeyedStateBackend`, which is the only place where this is currently needed due to its lazy deserialization characteristic. After we have eager state migration in place, the confrontation should happen for all state backends on restore. ## Tests - Serialization compatibility of the new checkpoint format is covered with existing tests. - Added a test that makes sure `InvalidClassException` is also caught when deserializing old serializers in the checkpoint (which occurs if the old serializer implementation was changed and results in a new serialVersionUID). - Added tests for Java serialization failure resilience when reading the new checkpoints, in `SerializerProxiesTest`. - Added end-to-end snapshot + restore tests which require reconfiguration of the `KryoSerializer` and `PojoSerializer` in cases where registration order of Kryo classes / Pojo types were changed. You can merge this pull request into a Git repository by running: $ git pull https://github.com/tzulitai/flink FLINK-6425 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/3834.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #3834 ---- commit 538a7acecce0d72e36e3726c0df2b6b96be35fc3 Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org> Date: 2017-05-01T13:32:10Z [FLINK-6190] [core] Migratable TypeSerializers This commit introduces the user-facing APIs for migratable TypeSerializers. The new user-facing APIs are: - new class: TypeSerializerConfigSnapshot - new class: ForwardCompatibleSerializationFormatConfig - new method: TypeSerializer#snapshotConfiguration() - new method: TypeSerializer#reconfigure(TypeSerializerConfigSnapshot) - new enum: ReconfigureResult commit ed82173fe97c6e9fb0784696bc4c49f10cc4e556 Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org> Date: 2017-05-02T11:35:18Z [hotfix] [core] Catch InvalidClassException in TypeSerializerSerializationProxy Previously, the TypeSerializerSerializationProxy only uses the dummy ClassNotFoundDummyTypeSerializer as a placeholder in the case where the user uses a completely new serializer and deletes the old one. There is also the case where the user changes the original serializer's implementation and results in an InvalidClassException when trying to deserialize the serializer. We should also use the ClassNotFoundDummyTypeSerializer as a temporary placeholder in this case. commit e77096af29b4cbea26113928fe93218c075e4035 Author: Tzu-Li (Gordon) Tai <tzuli...@apache.org> Date: 2017-05-06T12:40:58Z [FLINK-6425] [runtime] Activate serializer upgrades in state backends This commit fully activates state serializer upgrades by changing the following: - Include serializer configuration snapshots in checkpoints - On restore, use configuration snapshots to confront new serializers to perform the upgrade ---- --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---