Thoughts on coder evolution

Jan Lukavský Wed, 03 May 2023 06:58:00 -0700

Hi,

I'd like to discuss a topic that appears from time to time in different contexts (e.g. [1]). I'd like to restate the problem in a slightly more generic way as: "Should we have a way to completely exchange the coder of a PCollection/state of a _running_ Pipeline?". First, my motivation for this question: Beam has an extension called beam-sdks-java-extensions-kryo, which contains a KryoCoder. This coder uses Kryo [2] to serialize virtually any Java class into a binary format. Unfortunately, this binary representation differs between Kryo versions and it does not contain any way to recognize which version of Kryo was used to serialize the data. An attempt to deserialize bytes produced by an incompatible version of Kryo results in an exception. The current version of Kryo that is used by the KryoCoder is already more than 5 years old and an upgrade to a newer version is needed, because the current version does not work with JDK17+ [3]. Thus, the only option seems to be the creation of a different Coder (e.g. Kryo5Coder), but then we need the ability to transfer Pipelines using the old KryoCoder to the newer one. That is, we need to completely switch the coder that encodes a PCollection and/or state.


We have therefore the following options:

1) Simply ignore this and let users rerun the Pipeline from scratch. This is possible and should essentially always be applicable, but, if nothing else, for some Pipelines it might be costly to reprocess all historical data.

2) We can create the new Coder and let users use a runner-specific way to convert the Pipeline. E.g. in the case of Flink, this could be done by converting a savepoint into the new format. This requires knowledge of how Beam stores state (namespaces) and is kind of involved on the user side. We could probably provide runner-specific tools for this, but some runners, in general, might not allow such state manipulation.

3) We can include the information about a Coder update in the Pipeline, resubmit it to the runner, and let the runner handle it. Upon Pipeline restart, a runner would have to convert all state and all inflight data from the old Coder to the new one before resuming the Pipeline.

Option 3) seems like the most natural, but it requires support on the runner side.
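To illustrate what the conversion in option 3) boils down to, here is a minimal sketch of the decode-with-old, encode-with-new pass over stored bytes. Everything in it is an assumption made for illustration: SimpleCoder is a hypothetical stand-in and not Beam's Coder API, the two string formats merely play the roles of the old KryoCoder and a new coder, and a real runner would additionally have to deal with state namespaces, windows and inflight data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class CoderMigrationSketch {

  /** Hypothetical minimal coder interface; NOT org.apache.beam.sdk.coders.Coder. */
  interface SimpleCoder<T> {
    byte[] encode(T value) throws IOException;

    T decode(byte[] bytes) throws IOException;
  }

  /** Stand-in for the "old" format: length-prefixed modified UTF-8 (DataOutputStream.writeUTF). */
  static final SimpleCoder<String> OLD_CODER =
      new SimpleCoder<String>() {
        @Override
        public byte[] encode(String value) throws IOException {
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          new DataOutputStream(out).writeUTF(value);
          return out.toByteArray();
        }

        @Override
        public String decode(byte[] bytes) throws IOException {
          return new DataInputStream(new ByteArrayInputStream(bytes)).readUTF();
        }
      };

  /** Stand-in for the "new" format: raw UTF-8 bytes without a length prefix. */
  static final SimpleCoder<String> NEW_CODER =
      new SimpleCoder<String>() {
        @Override
        public byte[] encode(String value) {
          return value.getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public String decode(byte[] bytes) {
          return new String(bytes, StandardCharsets.UTF_8);
        }
      };

  /** Re-encodes every stored record: decode with the old coder, encode with the new one. */
  static <T> List<byte[]> migrate(
      List<byte[]> oldRecords, SimpleCoder<T> oldCoder, SimpleCoder<T> newCoder)
      throws IOException {
    List<byte[]> migrated = new ArrayList<>(oldRecords.size());
    for (byte[] record : oldRecords) {
      T value = oldCoder.decode(record);
      migrated.add(newCoder.encode(value));
    }
    return migrated;
  }

  public static void main(String[] args) throws IOException {
    // Pretend this list is state written by the old coder before the Pipeline was stopped.
    List<byte[]> state = new ArrayList<>();
    state.add(OLD_CODER.encode("first"));
    state.add(OLD_CODER.encode("second"));

    // After migration, only the new coder is needed to read the values back.
    for (byte[] bytes : migrate(state, OLD_CODER, NEW_CODER)) {
      System.out.println(NEW_CODER.decode(bytes));
    }
  }
}
```

Note that the old coder must still be available (on the classpath) at migration time, which is one reason a separate Coder class such as Kryo5Coder, rather than an in-place upgrade, fits this approach.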

I leave the details of how a runner would do this open; I'm currently interested in knowing what the community's position on this is.


 Jan

[1] https://lists.apache.org/thread/z2m1hg4l5k2kb7nhjkv2lnwf8g4t9wps

[2] https://github.com/EsotericSoftware/kryo

[3] https://github.com/EsotericSoftware/kryo/issues/885
