Hi all,

I was reviewing pull request 26444 [1], which is to do with removing data 
associated with version state for unit tests. I can look into creating a 
migration tool, and have copied in some of the people who have been involved 
with this for feedback, thoughts on the approach, and any considerations I 
have missed. As part of this I thought I would bring together what I am seeing:



  1.  The Kryo v2 to v5 FLIP [2] contained a migration section. The FLIP is 
still being discussed and has not been accepted, but Kryo has been upgraded. 
@guowei...@apache.org @Xintong Song @zake...@apache.org @Chen Zhanghao 
@Martijn Visser
  2.  The associated discussion [3] suggests that many parts of the community 
want a migration tool.
  3.  I see another issue [4] discussing this. Interestingly, there was an 
attempt to maintain v2 and v5 compatibility, but @Kurt Ostfeld took a simpler 
approach in light of Flink v2, which no longer required backward 
compatibility.
  4.  There was also a dev discussion [5] around this. I learnt that Kryo v2 
is built with Java 8. @ches...@apache.org
  5.  There is the v5 Kryo migration guide [6].

From a quick look at the code I see:

  1.  The main serialization appears to be in the KryoSerializer [8], which 
serializes a type.
  2.  TypeSerializerSnapshot [9] declares readSnapshot, which is called from 
Flink core and Avro. For Kryo this would be implemented by 
KryoSerializerSnapshot.java [10].
  3.  So a migration utility could issue a readSnapshot from v1 using Kryo 2, 
which creates a KryoSerializerSnapshotData. We would need to amend the 
snapshot in memory to change the version and replace the serializers as / if 
required, and then issue a writeSnapshot using Kryo 5.
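
A rough sketch of the read / amend / write flow in point 3 could look like the 
following. It deliberately uses only java.io and a hypothetical SnapshotData 
class rather than Flink's KryoSerializerSnapshotData. The version numbers 1 
and 2, the byte layout, and the example class names are all assumptions made 
for illustration and are not the real snapshot format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class SnapshotMigrationSketch {

    /** Hypothetical stand-in for snapshot data; NOT Flink's KryoSerializerSnapshotData. */
    static final class SnapshotData {
        final int version;
        final String typeClass;
        final Map<String, String> serializers; // registered class -> serializer class

        SnapshotData(int version, String typeClass, Map<String, String> serializers) {
            this.version = version;
            this.typeClass = typeClass;
            this.serializers = new LinkedHashMap<>(serializers);
        }
    }

    /** Reads the assumed layout: version, type class, then (class, serializer) pairs. */
    static SnapshotData read(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int version = in.readInt();
            String typeClass = in.readUTF();
            int count = in.readInt();
            Map<String, String> serializers = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String registered = in.readUTF();
                serializers.put(registered, in.readUTF());
            }
            return new SnapshotData(version, typeClass, serializers);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the same assumed layout that read() expects. */
    static byte[] write(SnapshotData data) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(data.version);
            out.writeUTF(data.typeClass);
            out.writeInt(data.serializers.size());
            for (Map.Entry<String, String> e : data.serializers.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Amends the in-memory snapshot: new version, serializers swapped where a replacement exists. */
    static SnapshotData migrate(SnapshotData old, int targetVersion, Map<String, String> replacements) {
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : old.serializers.entrySet()) {
            migrated.put(e.getKey(), replacements.getOrDefault(e.getValue(), e.getValue()));
        }
        return new SnapshotData(targetVersion, old.typeClass, migrated);
    }

    public static void main(String[] args) {
        Map<String, String> registrations = new LinkedHashMap<>();
        registrations.put("example.Event", "example.OldJavaSerializer"); // hypothetical names
        byte[] oldBytes = write(new SnapshotData(1, "example.Event", registrations));

        SnapshotData migrated = migrate(
                read(oldBytes), 2, Map.of("example.OldJavaSerializer", "example.NewJavaSerializer"));
        SnapshotData reread = read(write(migrated));
        System.out.println(reread.version + " " + reread.serializers);
    }
}
```

In a real tool the read side would have to run against Kryo 2 classes and the 
write side against Kryo 5 classes, which is why keeping the amendment step on 
plain in-memory data, as above, looks attractive.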



From the above it seems that a migration tool would need:

  *   some Kryo state to test against; the v1.20 JUnit reference state data 
could work
  *   to either follow the Kryo migration guide or work at the Flink snapshot 
level. The snapshot level may help deal with special cases (like the Flink 
rewriting of the Kryo JavaSerializer), hopefully avoiding the need to have a 
Flink v1 and a Flink v2 version of this class.
  *   a suitable Kryo build: Kryo 5.6.2 was recompiled to support Java 8 [7] 
but should be able to run with higher Java levels.
  *   a decision on which repo it needs to be in. I was thinking a sensible 
place to prototype is master, where I could use the up-to-date code and rename 
the Kryo 5.6.2 dependency to use kryo5, so we would have different class paths 
to Kryo v2. Kryo v2 could then be brought in as a test dependency to test 
whether a migration could work in unit tests.
  *   a test that the old and new snapshots are functionally equivalent, using 
the v1.20 (Kryo 2) state data in the unit tests: migrate to v2 (Kryo v5) and 
compare with a Kryo v5 to Kryo v5 snapshot.
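
The equivalence check in the last bullet could be shaped roughly as below. 
This is only a sketch: the Codec interface and the two toy formats are 
hypothetical stand-ins for a Kryo 2 backed and a Kryo 5 backed serializer, and 
no Kryo or Flink classes are used. It shows one way to compare a migrated 
value against one written natively in the new format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class MigrationEquivalenceSketch {

    /** Hypothetical abstraction standing in for a Kryo 2 or Kryo 5 backed serializer. */
    interface Codec {
        byte[] write(String value);

        String read(byte[] bytes);
    }

    /** Toy "old" format: raw UTF-8 bytes. */
    static final Codec OLD = new Codec() {
        public byte[] write(String value) {
            return value.getBytes(StandardCharsets.UTF_8);
        }

        public String read(byte[] bytes) {
            return new String(bytes, StandardCharsets.UTF_8);
        }
    };

    /** Toy "new" format: one marker byte followed by UTF-16BE bytes. */
    static final Codec NEW = new Codec() {
        public byte[] write(String value) {
            byte[] body = value.getBytes(StandardCharsets.UTF_16BE);
            byte[] out = new byte[body.length + 1];
            out[0] = 5; // arbitrary marker, chosen only for this example
            System.arraycopy(body, 0, out, 1, body.length);
            return out;
        }

        public String read(byte[] bytes) {
            return new String(bytes, 1, bytes.length - 1, StandardCharsets.UTF_16BE);
        }
    };

    /** Migrates by decoding with the old codec and re-encoding with the new one. */
    static byte[] migrate(byte[] oldBytes, Codec from, Codec to) {
        return to.write(from.read(oldBytes));
    }

    /** True if every migrated value matches the natively written one, in bytes and in value. */
    static boolean equivalent(List<String> values, Codec from, Codec to) {
        for (String value : values) {
            byte[] migrated = migrate(from.write(value), from, to);
            byte[] nativeBytes = to.write(value);
            if (!Arrays.equals(migrated, nativeBytes) || !value.equals(to.read(migrated))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalent(List.of("a", "flink", "kryo"), OLD, NEW));
    }
}
```

Comparing decoded values as well as bytes is a deliberate choice here: byte 
equality may be too strict if the new format is allowed to encode the same 
state in more than one way.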







Kind regards, David.

[1] https://github.com/apache/flink/pull/26444

[2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-317%3A+Upgrade+Kryo+from+2.24.0+to+5.5.0

[3] https://lists.apache.org/thread/odhglx8tmpdt6jnorgcsvxjqjfd169x6

[4] https://issues.apache.org/jira/browse/FLINK-3154

[5] https://lists.apache.org/thread/8zt76ftmfgprf4thmtwws87t1jnn0tg4

[6] https://github.com/EsotericSoftware/kryo/wiki/Migration-to-v5

[7] https://github.com/EsotericSoftware/kryo/releases/tag/kryo-parent-5.6.2

[8] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializer.java#L98

[9] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializerSnapshot.java#L103

[10] https://github.com/apache/flink/blob/c724168fad4215626b5596dd63cb66e477948aa0/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializerSnapshot.java#L41

