Hi all,

I was reviewing pull request 26444 [1], which is to do with removing data associated with version state for unit tests. Following on from that, I can look into creating a migration tool, and I have copied in some of the people who have been involved with this for feedback and thoughts on the approach, including any considerations I have missed. As part of this I thought I would bring together what I am seeing:

1. The Kryo v2 to v5 FLIP [2] contained a migration section. The FLIP is still being discussed and has not been accepted, but Kryo has been upgraded. @guowei...@apache.org<mailto:guowei...@apache.org> @Xintong Song<mailto:tonysong...@gmail.com> @zake...@apache.org<mailto:zake...@apache.org> @Chen Zhanghao<mailto:zhanghao.c...@outlook.com> @Martijn Visser<mailto:martijnvis...@apache.org>
2. The associated discussion [3] suggests that many parts of the community would like a migration tool.
3. I see another issue [4] discussing this. Interestingly, there was an attempt to maintain v2 and v5 compatibility, but @Kurt Ostfeld<mailto:kurtostf...@proton.me.INVALID> took a simpler approach in the light of Flink v2, which no longer required the backward compatibility.
4. There was also a dev discussion [5] around this. I learnt that Kryo v2 is built with Java 8. @ches...@apache.org<mailto:ches...@apache.org>
5. There is the Kryo v5 migration guide [6].

From a quick look at the code I see:

1. The main serialization appears to be in the KryoSerializer [8], which serializes a type.
2. TypeSerializerSnapshot [9] declares readSnapshot, which is called from Flink core and Avro. For Kryo this would be implemented by KryoSerializerSnapshot.java [10].
3. So a migration utility could issue a readSnapshot from v1 using Kryo 2, which creates a KryoSerializerSnapshotData. We would need to amend the snapshot in memory to change the version and replace the serializers as and if required, and then issue a writeSnapshot using Kryo 5.

From the above it seems that a migration tool would need to:

* Have some Kryo state to test against; the v1.20 JUnit reference state data could work.
* Either follow the Kryo migration guide or work at the Flink snapshot level. The snapshot level may help deal with special cases (like the Flink rewriting of the Kryo JavaSerializer), hopefully avoiding the need to have a Flink v1 and a Flink v2 version of this class.
* Take the Java level into account: Kryo 5.6.2 was recompiled to support Java 8 [7] but should be able to run with the higher Java levels.
* Decide which repo it needs to be in. I was thinking a sensible place to prototype is master, where I could use the up-to-date code and rename the Kryo 5.6.2 dependency to use kryo5, so that we would have different class paths from Kryo v2. I would bring in Kryo v2 as a test dependency to test whether a migration could work in unit tests.
* Test that the old and new snapshots are functionally equivalent using the v1.20 (Kryo 2) state data in the unit tests: migrate to v2 (Kryo v5) and compare with a Kryo v5 to Kryo v5 snapshot.

Kind regards, David.

[1] https://github.com/apache/flink/pull/26444
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0
[3] https://lists.apache.org/thread/odhglx8tmpdt6jnorgcsvxjqjfd169x6
[4] https://issues.apache.org/jira/browse/FLINK-3154
[5] https://lists.apache.org/thread/8zt76ftmfgprf4thmtwws87t1jnn0tg4
[6] https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5
[7] https://github.com/EsotericSoftware/kryo/releases/tag/kryo-parent-5.6.2
[8] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializer.java#L98
[9] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializerSnapshot.java#L103
[10] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializerSnapshot.java#L41

Unless otherwise stated above: IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: Building C, IBM Hursley Office, Hursley Park Road, Winchester, Hampshire SO21 2JN
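
P.S. To make the snapshot-level idea above a little more concrete (read the v1 snapshot, amend the version and serializers in memory, write it out again, then compare with a natively written v2 snapshot), here is a rough, self-contained sketch. It uses plain java.io streams and a made-up snapshot layout. Everything in it is an assumption for illustration only: SnapshotData, read, write, migrate and the serializer names are hypothetical stand-ins, not Flink or Kryo APIs. In a real tool the read side would presumably run against Kryo 2 and the write side against Kryo 5.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class SnapshotMigrationSketch {

    /** Hypothetical in-memory snapshot: format version, type name, serializer per registered type. */
    public static final class SnapshotData {
        public final int version;
        public final String typeName;
        public final Map<String, String> serializers;

        public SnapshotData(int version, String typeName, Map<String, String> serializers) {
            this.version = version;
            this.typeName = typeName;
            this.serializers = new LinkedHashMap<>(serializers);
        }
    }

    /** Writes the made-up layout: version, type name, entry count, then (type, serializer) pairs. */
    public static byte[] write(SnapshotData s) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(s.version);
            out.writeUTF(s.typeName);
            out.writeInt(s.serializers.size());
            for (Map.Entry<String, String> e : s.serializers.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }
        return bytes.toByteArray();
    }

    /** Reads the same made-up layout back into memory. */
    public static SnapshotData read(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int version = in.readInt();
            String typeName = in.readUTF();
            int count = in.readInt();
            Map<String, String> serializers = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String registeredType = in.readUTF();
                serializers.put(registeredType, in.readUTF());
            }
            return new SnapshotData(version, typeName, serializers);
        }
    }

    /** Bumps the version from 1 to 2 and swaps any serializer that has a replacement. */
    public static SnapshotData migrate(SnapshotData old, Map<String, String> replacements) {
        if (old.version != 1) {
            throw new IllegalArgumentException("expected a v1 snapshot, got v" + old.version);
        }
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : old.serializers.entrySet()) {
            migrated.put(e.getKey(), replacements.getOrDefault(e.getValue(), e.getValue()));
        }
        return new SnapshotData(2, old.typeName, migrated);
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the v1.20 reference state data.
        byte[] v1Bytes = write(new SnapshotData(1, "example.Pojo",
                Collections.singletonMap("example.Pojo", "old.JavaSerializer")));

        // Read with the old format, amend in memory, write with the new format.
        SnapshotData migrated = migrate(read(v1Bytes),
                Collections.singletonMap("old.JavaSerializer", "new.JavaSerializer"));

        // Compare against a snapshot written natively as v2.
        byte[] nativeV2 = write(new SnapshotData(2, "example.Pojo",
                Collections.singletonMap("example.Pojo", "new.JavaSerializer")));
        System.out.println("equivalent: " + Arrays.equals(write(migrated), nativeV2));
    }
}
```

The byte-for-byte comparison at the end is the simplest form of the equivalence check in the last bullet; a real test might need a looser, field-level comparison if the two Kryo versions legitimately produce different bytes.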