Thanks for writing this FLIP Timo. I think this will be a very important
improvement for Flink and our SQL user :-)

Similar to Francesco I would like to understand the statement

> For simplification of the design, we assume that upgrades use a step size
of a single
minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
1.14).

a bit better. Is it because Flink does not guarantee that a savepoint
created by version 1.x can be directly recovered by version 1.y with x + 1
< y but users might have to go through a cascade of upgrades? From how I
understand your proposal, the compiled plan won't be changed after being
written initially. Hence, I would assume that for the plan alone Flink will
have to give backwards compatibility guarantees for all versions. Am I
understanding this part correctly?

Cheers,
Till

On Thu, Nov 25, 2021 at 4:55 PM Francesco Guardiani <france...@ververica.com>
wrote:

> Hi Timo,
>
> Thanks for putting this amazing work together, I have some
> considerations/questions
> about the FLIP:
> *Proposed changes #6*: Other than defining this rule of thumb, we must
> also make sure
> that compiling plans with these objects that cannot be serialized in the
> plan must fail hard,
> so users don't bite themselves with such issues, or at least we need to
> output warning
> logs. In general, whenever the user is trying to use the CompiledPlan APIs
> and at the same
> time, they're trying to do something "illegal" for the plan, we should
> immediately either
> log or fail depending on the issue, in order to avoid any surprises once
> the user upgrades.
> I would also say the same for things like registering a function,
> registering a DataStream,
> and for every other thing which won't end up in the plan, we should log
> such info to the
> user by default.
>
> *General JSON Plan Assumptions #9:* When thinking to connectors and
> formats, I think
> it's reasonable to assume and keep out of the feature design that no
> feature/ability can
> deleted from a connector/format. I also don't think new features/abilities
> can influence
> this FLIP as well, given the plan is static, so if for example,
> MyCoolTableSink in the next
> flink version implements SupportsProjectionsPushDown, then it shouldn't be
> a problem
> for the upgrade story since the plan is still configured as computed from
> the previous flink
> version. What worries me is breaking changes, in particular behavioural
> changes that
> might happen in connectors/formats. Although this argument doesn't seem
> relevant for
> the connectors shipped by the flink project itself, because we try to keep
> them as stable as
> possible and avoid eventual breaking changes, it's compelling to external
> connectors and
> formats, which might be decoupled from the flink release cycle and might
> have different
> backward compatibility guarantees. It's totally reasonable if we don't
> want to tackle it in
> this first iteration of the feature, but it's something we need to keep in
> mind for the future.
>
>
> *Functions:* It's not clear to me what you mean for "identifier", because
> then somewhere
> else in the same context you talk about "name". Are we talking about the
> function name
> or the function complete signature? Let's assume for example we have these
> function
> definitions:
>
>
> * TO_TIMESTAMP_LTZ(BIGINT)
> * TO_TIMESTAMP_LTZ(STRING)
> * TO_TIMESTAMP_LTZ(STRING, STRING)
>
> These for me are very different functions with different implementations,
> where each of
> them might evolve separately at a different pace. Hence when we store them
> in the json
> plan we should perhaps use a logically defined unique id like
> /bigIntToTimestamp/, /
> stringToTimestamp/ and /stringToTimestampWithFormat/. This also solves the
> issue of
> correctly referencing the functions when restoring the plan, without
> running again the
> inference logic (which might have been changed in the meantime) and it
> might also solve
> the versioning, that is the function identifier can contain the function
> version like /
> stringToTimestampWithFormat_1_1 /or /stringToTimestampWithFormat_1_2/. An
> alternative could be to use the string signature representation, which
> might not be trivial
> to compute, given the complexity of our type inference logic.
>
> *The term "JSON plan"*: I think we should rather keep JSON out of the
> concept and just
> name it "Compiled Plan" (like the proposed API) or something similar, as I
> see how in
> future we might decide to support/modify our persistence format to
> something more
> efficient storage wise like BSON. For example, I would rename /
> CompiledPlan.fromJsonFile/ to simply /CompiledPlan.fromFile/.
>
> *Who is the owner of the plan file?* I asked myself this question when
> reading this:
>
>
> > For simplification of the design, we assume that upgrades use a step
> size of a single
> minor version. We don't guarantee skipping minor versions (e.g. 1.11 to
> 1.14).
>
> My understanding of this statement is that a user can upgrade between
> minors but then
> following all the minors, the same query can remain up and running. E.g. I
> upgrade from
> 1.15 to 1.16, and then from 1.16 to 1.17 and I still expect my original
> query to work
> without recomputing the plan. This necessarily means that at some point in
> future
> releases we'll need some basic "migration" tool to keep the queries up and
> running,
> ending up modifying the compiled plan. So I guess flink should write it
> back in the original
> plan file, perhaps doing a backup of the previous one? Can you please
> clarify this aspect?
>
> Except these considerations, the proposal looks good to me and I'm eagerly
> waiting to see
> it in play.
>
> Thanks,
> FG
>
>
> --
> Francesco Guardiani | Software Engineer
> france...@ververica.com[1]
>
> Follow us @VervericaData
> --
> Join Flink Forward[2] - The Apache Flink Conference
> Stream Processing | Event Driven | Real Time
> --
> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany
> --
> Ververica GmbH
> Registered at Amtsgericht Charlottenburg: HRB 158244 B
> Managing Directors: Karl Anton Wehner, Holger Temme, Yip Park Tung Jason,
> Jinwei (Kevin)
> Zhang
>
> --------
> [1] mailto:france...@ververica.com
> [2] https://flink-forward.org/
>

Reply via email to