Hi All,

We recently announced the availability of the Beam Transform Service [1].
One feature this enables is upgrading individual transforms of a pipeline
to a new Beam version without upgrading the whole pipeline [2].

I authored a PoC PR that addresses this for the Java SDK:
https://github.com/apache/beam/pull/28210

This introduces new pipeline options (in ExternalTranslationOptions) that
allow Beam users to control exactly which transforms to upgrade and to
which Beam version. More specifically:

* transformsToOverride: accepts a list of URNs that uniquely identify the
transforms to upgrade.
* transformServiceBeamVersion: takes the Beam version to upgrade to. The
implementation automatically starts up a transform service for this Beam
version and upgrades the transforms identified in the
'transformsToOverride' option to this version.
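
As a rough illustration of how the two options might be supplied, here is a
minimal sketch that only builds the command-line flags. It assumes the
options follow Beam's usual "--optionName=value" flag convention with
comma-separated list values; the URN and the version string are hypothetical
placeholders, and the class and helper names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class UpgradeFlagsSketch {

  // Hypothetical helper: formats the two options as "--name=value" flags.
  // In a real pipeline these args would typically be handed to
  // PipelineOptionsFactory.fromArgs(args); viewing them as
  // ExternalTranslationOptions is an assumption based on the PoC description.
  static String[] buildArgs(List<String> urnsToUpgrade, String beamVersion) {
    List<String> args = new ArrayList<>();
    // List-valued option: URNs joined with commas.
    args.add("--transformsToOverride=" + String.join(",", urnsToUpgrade));
    // Beam version the transform service should be started for.
    args.add("--transformServiceBeamVersion=" + beamVersion);
    return args.toArray(new String[0]);
  }

  public static void main(String[] ignored) {
    // Placeholder URN and version, not real values.
    String[] args =
        buildArgs(List.of("beam:transform:example:hypothetical_read:v1"), "2.x.y");
    for (String arg : args) {
      System.out.println(arg);
    }
  }
}
```

Transforms whose URNs are not listed would be left at the pipeline's own
Beam version.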

To implement this, I'm extending the existing "TransformPayloadTranslator"
[3] interface so that transform construction can be performed using a
construction schema. This is partly what schema-aware transforms already
do, but the extension also lets us upgrade existing transforms that do not
take PCollection<Row> as input and output.
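
To make the idea of schema-based construction a bit more concrete, here is a
small self-contained sketch. It is not the interface from the PR: the
translator interface, the transform class, and the field names are all
hypothetical, and a plain Map stands in for a Beam Row. It only shows the
round trip of capturing a transform's configuration as named fields and
rebuilding the transform from those fields, which is what would let a
different Beam version reconstruct it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ConstructionSchemaSketch {

  // Hypothetical transform whose configuration should survive an upgrade.
  static final class ReadFromTable {
    final String table;
    final int batchSize;

    ReadFromTable(String table, int batchSize) {
      this.table = table;
      this.batchSize = batchSize;
    }
  }

  // Hypothetical translator shape (NOT Beam's TransformPayloadTranslator):
  // one method captures the configuration, the other rebuilds the transform.
  interface SchemaBasedTranslator<T> {
    Map<String, Object> toConfig(T transform);

    T fromConfig(Map<String, Object> config);
  }

  static final class ReadFromTableTranslator
      implements SchemaBasedTranslator<ReadFromTable> {
    @Override
    public Map<String, Object> toConfig(ReadFromTable transform) {
      // The map keys play the role of the construction schema's field names.
      Map<String, Object> config = new LinkedHashMap<>();
      config.put("table", transform.table);
      config.put("batch_size", transform.batchSize);
      return config;
    }

    @Override
    public ReadFromTable fromConfig(Map<String, Object> config) {
      return new ReadFromTable(
          (String) config.get("table"), (Integer) config.get("batch_size"));
    }
  }

  public static void main(String[] args) {
    ReadFromTableTranslator translator = new ReadFromTableTranslator();
    Map<String, Object> config =
        translator.toConfig(new ReadFromTable("my_table", 500));
    // In the upgrade scenario, this step would run in the newer Beam version.
    ReadFromTable rebuilt = translator.fromConfig(config);
    System.out.println(rebuilt.table + " " + rebuilt.batchSize);
  }
}
```

Nothing in this round trip requires the transform's inputs or outputs to be
PCollection<Row>; only the configuration is expressed as schema fields.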

Please take a look and let me know if you have any comments (here or in the
PR).

Thanks,
Cham

[1] https://lists.apache.org/thread/j0bhcsn7dvdv4wch5rb1z1qbnxmt70r9
[2] https://github.com/apache/beam/issues/27943
[3]
https://github.com/apache/beam/pull/28210/files#diff-58ff54e017947d68975c0c1ce419545c500112afe9b6718b2f5935cb971702dbL512
