Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

Marios Trivyzas Mon, 29 Nov 2021 01:15:53 -0800

Hi Timo!

Thanks a lot for taking all that time and effort to put together this
proposal!


Regarding:
> For simplification of the design, we assume that upgrades use a step size
of a single
minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
1.14).

I think that for this first step we should make it absolutely clear to the
users that they would need to go through all intermediate versions
to end up with the target version they wish. If we are to support skipping
versions in the future, i.e. upgrade from 1.14 to 1.17, this means
that we need to have a testing infrastructure in place that would test all
possible combinations of version upgrades, i.e. from 1.14 to 1.15,
from 1.14 to 1.16 and so forth, while still testing and of course
supporting all the upgrades from the previous minor version.

I like a lot the idea of introducing HINTS to define some behaviour in the
programs!
- the hints live together with the sql statements and consequently the
(JSON) plans.
- If multiple queries are involved in a program, each one of them can
define its own config (regarding plan optimisation, not null enforcement,
etc)

I agree with Francesco on his argument regarding the *JSON* plan. I believe
we should already provide flexibility here, since (who knows) in the future
a JSON plan might not fulfil the desired functionality.

I also agree that we need some very obvious way (i.e. not log entry) to
show the users that their program doesn't support version upgrades, and
prevent them from being negatively surprised in the future, when trying to
upgrade their production pipelines.

This is an implementation detail, but I'd like to add that there should be
some good logging in place when the upgrade is taking place, to be
able to track every restoration action, and help debug any potential issues
arising from that.





On Fri, Nov 26, 2021 at 2:54 PM Till Rohrmann <trohrm...@apache.org> wrote:

> Thanks for writing this FLIP Timo. I think this will be a very important
> improvement for Flink and our SQL user :-)
>
> Similar to Francesco I would like to understand the statement
>
> > For simplification of the design, we assume that upgrades use a step size
> of a single
> minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
> 1.14).
>
> a bit better. Is it because Flink does not guarantee that a savepoint
> created by version 1.x can be directly recovered by version 1.y with x + 1
> < y but users might have to go through a cascade of upgrades? From how I
> understand your proposal, the compiled plan won't be changed after being
> written initially. Hence, I would assume that for the plan alone Flink will
> have to give backwards compatibility guarantees for all versions. Am I
> understanding this part correctly?
>
> Cheers,
> Till
>
> On Thu, Nov 25, 2021 at 4:55 PM Francesco Guardiani <
> france...@ververica.com>
> wrote:
>
> > Hi Timo,
> >
> > Thanks for putting this amazing work together, I have some
> > considerations/questions
> > about the FLIP:
> > *Proposed changes #6*: Other than defining this rule of thumb, we must
> > also make sure
> > that compiling plans with these objects that cannot be serialized in the
> > plan must fail hard,
> > so users don't bite themselves with such issues, or at least we need to
> > output warning
> > logs. In general, whenever the user is trying to use the CompiledPlan
> APIs
> > and at the same
> > time, they're trying to do something "illegal" for the plan, we should
> > immediately either
> > log or fail depending on the issue, in order to avoid any surprises once
> > the user upgrades.
> > I would also say the same for things like registering a function,
> > registering a DataStream,
> > and for every other thing which won't end up in the plan, we should log
> > such info to the
> > user by default.
> >
> > *General JSON Plan Assumptions #9:* When thinking to connectors and
> > formats, I think
> > it's reasonable to assume and keep out of the feature design that no
> > feature/ability can
> > deleted from a connector/format. I also don't think new
> features/abilities
> > can influence
> > this FLIP as well, given the plan is static, so if for example,
> > MyCoolTableSink in the next
> > flink version implements SupportsProjectionsPushDown, then it shouldn't
> be
> > a problem
> > for the upgrade story since the plan is still configured as computed from
> > the previous flink
> > version. What worries me is breaking changes, in particular behavioural
> > changes that
> > might happen in connectors/formats. Although this argument doesn't seem
> > relevant for
> > the connectors shipped by the flink project itself, because we try to
> keep
> > them as stable as
> > possible and avoid eventual breaking changes, it's compelling to external
> > connectors and
> > formats, which might be decoupled from the flink release cycle and might
> > have different
> > backward compatibility guarantees. It's totally reasonable if we don't
> > want to tackle it in
> > this first iteration of the feature, but it's something we need to keep
> in
> > mind for the future.
> >
> >
> > *Functions:* It's not clear to me what you mean for "identifier", because
> > then somewhere
> > else in the same context you talk about "name". Are we talking about the
> > function name
> > or the function complete signature? Let's assume for example we have
> these
> > function
> > definitions:
> >
> >
> > * TO_TIMESTAMP_LTZ(BIGINT)
> > * TO_TIMESTAMP_LTZ(STRING)
> > * TO_TIMESTAMP_LTZ(STRING, STRING)
> >
> > These for me are very different functions with different implementations,
> > where each of
> > them might evolve separately at a different pace. Hence when we store
> them
> > in the json
> > plan we should perhaps use a logically defined unique id like
> > /bigIntToTimestamp/, /
> > stringToTimestamp/ and /stringToTimestampWithFormat/. This also solves
> the
> > issue of
> > correctly referencing the functions when restoring the plan, without
> > running again the
> > inference logic (which might have been changed in the meantime) and it
> > might also solve
> > the versioning, that is the function identifier can contain the function
> > version like /
> > stringToTimestampWithFormat_1_1 /or /stringToTimestampWithFormat_1_2/. An
> > alternative could be to use the string signature representation, which
> > might not be trivial
> > to compute, given the complexity of our type inference logic.
> >
> > *The term "JSON plan"*: I think we should rather keep JSON out of the
> > concept and just
> > name it "Compiled Plan" (like the proposed API) or something similar, as
> I
> > see how in
> > future we might decide to support/modify our persistence format to
> > something more
> > efficient storage wise like BSON. For example, I would rename /
> > CompiledPlan.fromJsonFile/ to simply /CompiledPlan.fromFile/.
> >
> > *Who is the owner of the plan file?* I asked myself this question when
> > reading this:
> >
> >
> > > For simplification of the design, we assume that upgrades use a step
> > size of a single
> > minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
> > 1.14).
> >
> > My understanding of this statement is that a user can upgrade between
> > minors but then
> > following all the minors, the same query can remain up and running. E.g.
> I
> > upgrade from
> > 1.15 to 1.16, and then from 1.16 to 1.17 and I still expect my original
> > query to work
> > without recomputing the plan. This necessarily means that at some point
> in
> > future
> > releases we'll need some basic "migration" tool to keep the queries up
> and
> > running,
> > ending up modifying the compiled plan. So I guess flink should write it
> > back in the original
> > plan file, perhaps doing a backup of the previous one? Can you please
> > clarify this aspect?
> >
> > Except these considerations, the proposal looks good to me and I'm
> eagerly
> > waiting to see
> > it in play.
> >
> > Thanks,
> > FG
> >
> >
> > --
> > Francesco Guardiani | Software Engineer
> > france...@ververica.com[1]
> >
> > Follow us @VervericaData
> > --
> > Join Flink Forward[2] - The Apache Flink Conference
> > Stream Processing | Event Driven | Real Time
> > --
> > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> > --
> > Ververica GmbH
> > Registered at Amtsgericht Charlottenburg: HRB 158244 B
> > Managing Directors: Karl Anton Wehner, Holger Temme, Yip Park Tung Jason,
> > Jinwei (Kevin)
> > Zhang
> >
> > --------
> > [1] mailto:france...@ververica.com
> > [2] https://flink-forward.org/
> >
>


-- 
Marios

Re: [DISCUSS] FLIP-190: Support Version Upgrades for Table API & SQL Programs

Reply via email to