Re: [POLL] Dropping savepoint format compatibility for 1.1.x in the Flink 1.4.0 release

Stephan Ewen Thu, 17 Aug 2017 09:30:11 -0700

@Greg - One benefit I can clearly see is the following:

If we keep that old 1.1-style state code, then we want to guarantee its
correctness in the face of the changes that have been made (consolidating
state code to be per-operator rather than per-task in the runtime as well)
and the changes that are WIP, for example for state evolution (like eager
state) or better failover (don't reload state from DFS if the node did
not actually crash).
Guaranteeing that correctness is a lot of work. I think that not ensuring
the correctness as thoroughly would be worse than removing that code.


The CEP library had not really reached a stable state in 1.2 anyway; it was
broken and reworked to introduce features like 'loops' or quantifiers. So
on that side, there would be no 1.2 compatibility anyway.

Do you see any concrete case where dropping 1.1 compatibility breaks any
setups? I personally know of no user who is still on 1.1; most were very
eager about 1.2 due to rescalable state.

As someone who has also worked on the state code, I completely understand
Stefan's desire to simplify things there - it really slows down
development in that component big time right now...


On Thu, Aug 17, 2017 at 1:34 PM, Stefan Richter <[email protected]> wrote:

> I think we are still doing changes for which this is relevant. Also I
> cannot really see a benefit in delaying this because the whole discussion
> will apply in exactly the same way to 1.5.
>
> > On 17.08.2017 at 13:29, Greg Hogan <[email protected]> wrote:
> >
> > There’s an argument for delaying this change to 1.5 since the feature
> > freeze is two weeks away. There is little time to realize benefits from
> > removing this code.
> >
> > "The reason for that is that there is a lot of code mapping between the
> > completely different legacy format (1.1.x, not re-scalable) and the
> > key-group-oriented format (1.2.x onwards, re-scalable). It would greatly
> > help the development of state and checkpointing features to drop that old
> > code."
> >
> > Greg
> >
> >
> >> On Aug 17, 2017, at 5:36 AM, Stefan Richter <[email protected]> wrote:
> >>
> >> One more comment about the consequences of this PR, as pointed out in
> >> the comments on GitHub: this will also break direct compatibility for the
> >> CEP library between Flink 1.2 and 1.4. There is still a way to migrate via
> >> Flink 1.3: Flink 1.1/1.2 -> savepoint -> Flink 1.3 -> savepoint -> Flink 1.4.
> >>
> >>> On 16.08.2017 at 17:31, Stefan Richter <[email protected]> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Since there have been no objections for a long time, I took the next
> >>> step and created a PR that implements this change in commit
> >>> 95e44099784c9deaf2ca422b8dfc11c3d67d7f82 of
> >>> https://github.com/apache/flink/pull/4550. Announcing this here as a
> >>> last opportunity for further discussion. FYI, this will decrease the
> >>> code base by almost 12K LOC.
> >>>
> >>> Best,
> >>> Stefan
> >>>
> >>>
> >>>> On 02.08.2017 at 15:26, Kostas Kloudas <[email protected]> wrote:
> >>>>
> >>>> +1
> >>>>
> >>>>> On Aug 2, 2017, at 3:16 PM, Till Rohrmann <[email protected]> wrote:
> >>>>>
> >>>>> +1
> >>>>>
> >>>>> On Wed, Aug 2, 2017 at 9:12 AM, Stefan Richter <[email protected]> wrote:
> >>>>>
> >>>>>> +1
> >>>>>>
> >>>>>> On 28.07.2017 at 16:03, Stephan Ewen <[email protected]> wrote:
> >>>>>>
> >>>>>> Seems like no one raised a concern so far about dropping the savepoint
> >>>>>> format compatibility for 1.1 in 1.4.
> >>>>>>
> >>>>>> Leaving this thread open for some more days, but from the sentiment, it
> >>>>>> seems like we should go ahead?
> >>>>>>
> >>>>>> On Wed, Jul 12, 2017 at 4:43 PM, Stephan Ewen <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi users!
> >>>>>>>
> >>>>>>> Flink currently maintains backwards compatibility for savepoint formats,
> >>>>>>> which means that savepoints taken with Flink version 1.1.x and 1.2.x can be
> >>>>>>> resumed in Flink 1.3.x.
> >>>>>>>
> >>>>>>> We are discussing how many versions back to support. The proposition is
> >>>>>>> the following:
> >>>>>>>
> >>>>>>> *   Suggestion: Flink 1.4.0 will be able to resume savepoints taken with
> >>>>>>> version 1.3.x and 1.2.x, but not savepoints from version 1.1.x and 1.0.x*
> >>>>>>>
> >>>>>>>
> >>>>>>> The reason for that is that there is a lot of code mapping between the
> >>>>>>> completely different legacy format (1.1.x, not re-scalable) and the
> >>>>>>> key-group-oriented format (1.2.x onwards, re-scalable). It would greatly
> >>>>>>> help the development of state and checkpointing features to drop that old
> >>>>>>> code.
> >>>>>>>
> >>>>>>> Please let us know if you have concerns about that.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Stephan
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>
> >>>
> >>
> >
>
>
