+1 million to this. I think this could be a real game-changer. I would say even more forcefully that update compatibility has pushed our development style into either "never make significant changes" or "every significant change is wildly more complex than it should be". It forces our first draft to be our final draft, much more so than abstraction-based backwards compatibility, because it requires freezing many implementation details as well.
And just to put more non-subjective data behind my +1, I have used this approach many times in situations where a new version of a service rolled out while still serving older clients (using the URL as the flag). It is a tried-and-true technique, and connecting it to Beam is like an epiphany. Hooray!

The easiest way to ensure clean code is to make older versions more like straight-line code, flattening out cyclomatic complexity by forking transforms at the top level. In other words, FooIO.read() immediately delegates to FooIO_2_48.read(). You shouldn't be checking this flag in a bunch of separate places inside an IO. In fact, I might say that should be largely forbidden and it should only be used as a "routing" flag.

Kenn

On Wed, Oct 25, 2023 at 8:25 PM Robert Bradshaw via dev <dev@beam.apache.org> wrote:

> Dataflow (among other runners) has the ability to "upgrade" running
> pipelines with new code (e.g. capturing bug fixes, dependency updates,
> and limited topology changes). Unfortunately some improvements (e.g.
> new and improved ways of writing to BigQuery, optimized use of side
> inputs, a change in algorithm, sometimes completely internally and not
> visible to the user) are not sufficiently backwards compatible which
> causes us, with the motivation to not break users, to either not make
> these changes or guard them as a parallel opt-in mode which is a
> significant drain on both developer productivity and causes new
> pipelines to run in obsolete modes by default.
>
> I created https://github.com/apache/beam/pull/29140 which adds a new
> pipeline option, update_compatibility_version, that allows the SDK to
> move forward while letting users with pipelines launched previously to
> manually request the "old" way of doing things to preserve update
> compatibility. (We should still attempt backwards compatibility when
> it makes sense, and the old way would remain in code until such a time
> it's actually deprecated and removed, but this means we won't be
> constrained by it, especially when it comes to default settings.)
>
> Any objections or other thoughts on this approach?
>
> - Robert
>
> P.S. Separately I think it'd be valuable to elevate the vague notion
> of update compatibility to a first-class Beam concept and put it on
> firm footing, but that's a larger conversation outside the thread of
> this smaller (and I think still useful in such a future world) change.
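
To make the "routing flag" idea from my reply concrete, here is a minimal Python sketch. Everything in it is hypothetical: FooIO, FooIO_2_48, FooIO_Current, and passing the flag as a plain argument are made up for illustration and are not Beam APIs. Only the option name update_compatibility_version comes from the PR. It also assumes purely numeric "major.minor[.patch]" version strings.

```python
# Hypothetical sketch of "top-level routing"; none of these classes exist in Beam.

def major_minor(version):
    """Turn '2.48.0' into (2, 48) so versions compare numerically."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


class FooIO_2_48:
    """Frozen, straight-line copy of the read path as of 2.48 (hypothetical)."""

    @staticmethod
    def read(source):
        return f"legacy read of {source}"


class FooIO_Current:
    """The current read path, free to change between releases (hypothetical)."""

    @staticmethod
    def read(source):
        return f"current read of {source}"


class FooIO:
    """Public entry point; the only place that looks at the flag."""

    @staticmethod
    def read(source, update_compatibility_version=None):
        # Route once, at the top level. Neither implementation below
        # ever inspects the flag again.
        if (update_compatibility_version is not None
                and major_minor(update_compatibility_version) <= (2, 48)):
            return FooIO_2_48.read(source)
        return FooIO_Current.read(source)


if __name__ == "__main__":
    print(FooIO.read("my_table", update_compatibility_version="2.48.0"))
    print(FooIO.read("my_table"))
```

Because the flag is consulted in exactly one place, the old path stays frozen straight-line code, and retiring it later should amount to deleting one class and one branch.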