Hi,

Quick update from a quick play: using KryoSerializerSnapshot is not going to work as I suggested below. I did not realise how many parts of Flink depend on Kryo; I can see no easy way to have v2 and v5 coexisting in the Flink repo without a large refactor. For example, I see that the Kryo Serializer class has 160 inheritors in Flink.
I am going to attempt Chesnay’s suggestion in [1]:

“However, both versions can be on the classpath without classpath as v5 offers a versioned artifact that includes the version in the package. It probably wouldn't be difficult to migrate a savepoint to Kryo v5, purely from a read/write perspective.”

In the same post Chesnay talks of the bigger question:

“The bigger question is how we expose this new Kryo version in the API. If we stick to the versioned jar we need to either duplicate all current Kryo-related APIs or find a better way to integrate other serialization”

As we are on Flink v2 now, we do not need to deal with the bigger question, just migrate the savepoints.

Kind regards, David.

[1] https://lists.apache.org/thread/8zt76ftmfgprf4thmtwws87t1jnn0tg4

From: David Radley <david_rad...@uk.ibm.com>
Date: Wednesday, 23 April 2025 at 16:44
To: dev@flink.apache.org <dev@flink.apache.org>
Cc: ches...@apache.org <ches...@apache.org>, guowei...@apache.org <guowei...@apache.org>, Xintong Song <tonysong...@gmail.com>, zake...@apache.org <zake...@apache.org>, Chen Zhanghao <zhanghao.c...@outlook.com>, Martijn Visser <martijnvis...@apache.org>
Subject: [EXTERNAL] FW: Kryo v2 to v5 migration tool

Hi all,

I was reviewing and realised from pull request 26444 [1] which is to do with removing data associated with eversion state for unit tests. I can look into creating a migration tool and have copied in some of the people who have been involved with this for feedback / thoughts on the approach / considerations I have missed.

As part of this I thought I would bring together what I am seeing:

1. The Kryo v2 to v5 Flip [2] contained a migration section. The Flip is still being discussed and has not been accepted, but Kryo has been upgraded.
@guowei...@apache.org @Xintong Song @zake...@apache.org @Chen Zhanghao @Martijn Visser

2. It seems that the associated discussion [3] mentioned that a migration tool is desirable to many parts of the community.
3. I see another issue [4] discussing this. Interestingly, there was an attempt to maintain v2 and v5 compatibility, but a simpler approach was taken by @Kurt Ostfeld in the light of Flink v2, which no longer required the backward compatibility.
4. There was a dev discussion [5] around this also. I learnt that Kryo v2 is built with Java 8. @ches...@apache.org
5. There is the v5 Kryo migration guide [6].

From a quick look at the code I see:

1. The main serialization appears to be in the KryoSerializer [8], which serializes a type.
2. I see TypeSerializerSnapshot [9] declares readSnapshot, which is called from Flink core and Avro. For Kryo this would be implemented by KryoSerializerSnapshot.java [10].
3. So a migration utility could issue a readSnapshot from v1 using Kryo 2, which creates a KryoSerializerSnapshotData. We would need to amend the snapshot in memory to change the version and replace the serializers as / if required, and then issue a writeSnapshot using Kryo 5.

From the above it seems that a migration tool would need:

* some Kryo state to test against; the v1.20 junit reference state data could work.
* to look to follow the Kryo migration guide or work at the Flink snapshot level. The snapshot level may help deal with special cases (like the Flink rewriting of the Kryo JavaSerializer), hopefully avoiding the need to have a Flink v1 and a Flink v2 version of this class.
* Kryo 5.6.2 was recompiled to support Java 8 [7] but should be able to run with the higher Java levels.
* Decide which repo it needs to be in.
  I was thinking a sensible place to prototype is master, where I could use the up-to-date code and rename the Kryo 5.6.2 dependency to use kryo5, so we would have different class paths to Kryo v2. Bring in Kryo v2 as a test dependency to test whether a migration could work in unit tests.
* test that the old and new snapshots are functionally equivalent using the v1.20 (Kryo 2) state data in the unit tests: migrate to v2 (Kryo v5) and compare with a Kryo v5 to Kryo v5 snapshot.

Kind regards, David.

[1] https://github.com/apache/flink/pull/26444
[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0
[3] https://lists.apache.org/thread/odhglx8tmpdt6jnorgcsvxjqjfd169x6
[4] https://issues.apache.org/jira/browse/FLINK-3154
[5] https://lists.apache.org/thread/8zt76ftmfgprf4thmtwws87t1jnn0tg4
[6] https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5
[7] https://github.com/EsotericSoftware/kryo/releases/tag/kryo-parent-5.6.2
[8] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializer.java#L98
[9] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializerSnapshot.java#L103
[10] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializerSnapshot.java#L41

Unless otherwise stated above: IBM United Kingdom Limited Registered in England and Wales with number 741598 Registered office: Building C, IBM Hursley Office, Hursley Park Road, Winchester, Hampshire SO21 2JN
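As a rough illustration of the read / amend / write flow described in point 3 of the code notes above, here is a minimal, self-contained Java sketch. It deliberately uses no Flink or Kryo classes. ToySnapshot, SnapshotMigrationSketch, the toy byte layout, the version numbers 1 and 2, and the two package prefixes are all assumptions made up for the example; this is not the real KryoSerializerSnapshotData format, and a real tool would presumably go through readSnapshot and writeSnapshot instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy stand-in for a serializer snapshot. Hypothetical; not Flink's real snapshot layout. */
final class ToySnapshot {
    final int version;
    /** Registered type name -> serializer class name, kept in registration order. */
    final Map<String, String> registrations;

    ToySnapshot(int version, Map<String, String> registrations) {
        this.version = version;
        this.registrations = registrations;
    }
}

final class SnapshotMigrationSketch {
    // Placeholder version numbers chosen for this illustration only.
    static final int LEGACY_VERSION = 1;
    static final int TARGET_VERSION = 2;
    // Assumed package prefixes: unversioned Kryo 2 vs. the versioned kryo5 artifact.
    static final String OLD_PREFIX = "com.esotericsoftware.kryo.";
    static final String NEW_PREFIX = "com.esotericsoftware.kryo.kryo5.";

    /** Step 1: read the toy layout (version, count, then name pairs) into memory. */
    static ToySnapshot read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            int count = in.readInt();
            Map<String, String> registrations = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String typeName = in.readUTF();
                String serializerName = in.readUTF();
                registrations.put(typeName, serializerName);
            }
            return new ToySnapshot(version, registrations);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Step 2: bump the version and re-point Kryo 2 serializer names at the kryo5 package. */
    static ToySnapshot amend(ToySnapshot legacy) {
        if (legacy.version != LEGACY_VERSION) {
            throw new IllegalArgumentException("Unexpected snapshot version: " + legacy.version);
        }
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : legacy.registrations.entrySet()) {
            String serializer = entry.getValue();
            // Names outside the Kryo package (e.g. user serializers) are left untouched.
            if (serializer.startsWith(OLD_PREFIX) && !serializer.startsWith(NEW_PREFIX)) {
                serializer = NEW_PREFIX + serializer.substring(OLD_PREFIX.length());
            }
            migrated.put(entry.getKey(), serializer);
        }
        return new ToySnapshot(TARGET_VERSION, migrated);
    }

    /** Step 3: write the amended snapshot back out in the same toy layout. */
    static byte[] write(ToySnapshot snapshot) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(snapshot.version);
            out.writeInt(snapshot.registrations.size());
            for (Map.Entry<String, String> entry : snapshot.registrations.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Read, amend, write: the whole migration in one call. */
    static byte[] migrate(byte[] legacyBytes) {
        return write(amend(read(legacyBytes)));
    }

    public static void main(String[] args) {
        Map<String, String> registrations = new LinkedHashMap<>();
        registrations.put("com.example.Event", OLD_PREFIX + "serializers.JavaSerializer");
        registrations.put("com.example.Key", "org.example.CustomKeySerializer");
        byte[] legacy = write(new ToySnapshot(LEGACY_VERSION, registrations));
        ToySnapshot migrated = read(migrate(legacy));
        System.out.println(migrated.version + " " + migrated.registrations);
    }
}
```

The same shape gives a cheap version of the equivalence check in the last bullet: migrate a legacy snapshot, write the expected snapshot natively at the target version, and compare the two byte arrays.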