My concern was that it changes Flink's behaviour in a non-trivial way, which could break existing setups. It is true that Flink's exactly-once guarantees are thwarted by that, though.
The only problematic case is when users delete the savepoint files without the corresponding meta state savepoint file. In this case, Flink would not be able to recover, right? This might indeed be only an academic case.

I've only reverted FLINK-10247 so far.

Cheers,
Till

On Tue, Oct 16, 2018 at 5:46 PM Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi,
>
> I think the "savepoints for recovery" change is a fix for a likely violation
> of exactly-once guarantees while it has close to zero downsides in
> real-world use cases. Therefore I think we should not revert it and release
> the fix for 1.5.5 and 1.6.2.
>
> Best,
> Aljoscha
>
> > On 16. Oct 2018, at 13:48, Chesnay Schepler <ches...@apache.org> wrote:
> >
> > I agree that these change the behavior of Flink and could cause issues
> > for users, hence I would be in favor of reverting said changes.
> >
> > On 16.10.2018 13:46, Till Rohrmann wrote:
> >> Sorry for the late notification, but I think we have some changes in
> >> the release branches which we might consider for reverting.
> >>
> >> 1) Savepoints being considered for recovery [1]
> >>
> >> The problem is that we change Flink's savepoint contract in the sense
> >> that savepoints are no longer exclusively under the control of the user.
> >>
> >> 2) Run metrics query service in separate actor system [2]
> >>
> >> The problem is that we start a new ActorSystem for the
> >> MetricQueryService which needs another port being opened to communicate.
> >>
> >> What do you think?
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-10354
> >> [2] https://issues.apache.org/jira/browse/FLINK-10247
> >>
> >> Cheers,
> >> Till
> >>
> >> On Tue, Oct 16, 2018 at 8:52 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>
> >>     My issues have been merged as well.
> >>
> >>     I will cut the release branches in 3 hours.
> >>
> >>     On 15.10.2018 22:11, Till Rohrmann wrote:
> >>     > FLINK-9932 has been merged.
> >>     >
> >>     > Cheers,
> >>     > Till
> >>     >
> >>     > On Mon, Oct 15, 2018 at 1:19 PM Till Rohrmann <trohrm...@apache.org> wrote:
> >>     >
> >>     >> Thanks a lot for starting this discussion Chesnay. I fully agree that a
> >>     >> new bug fix release would be justified.
> >>     >>
> >>     >> I'm currently working on FLINK-9932 which I would like to include in the
> >>     >> next bug fix release. It should be done by the end of today.
> >>     >>
> >>     >> Cheers,
> >>     >> Till
> >>     >>
> >>     >> On Mon, Oct 15, 2018 at 11:40 AM Chesnay Schepler <ches...@apache.org> wrote:
> >>     >>
> >>     >>> Hello,
> >>     >>>
> >>     >>> we've accumulated various fixes for 1.5.5 (24) and 1.6.2 (37) that
> >>     >>> improve stability by quite a bit along with some neat usability fixes.
> >>     >>>
> >>     >>> I'm proposing to do the next bugfix releases soon (I suggest tomorrow as
> >>     >>> a tentative vote date), and volunteer to handle the release process for
> >>     >>> both of them.
> >>     >>>
> >>     >>> There are some back-ports that I myself want to get in first
> >>     >>> (FLINK-10282, FLINK-10075, FLINK-10135) but this should be done by today.
> >>     >>>
> >>     >>> Regards,
> >>     >>>
> >>     >>> Chesnay