+1

I think it could make sense to backport my safety net PR https://github.com/apache/flink/pull/2691 for 1.1.4. The changes are pretty much isolated, and it could help a lot with resource leaks and task cancellation times.

Best,
Stefan

> On 26.10.2016 at 07:05, Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> +1
>
> Looking forward to this release!
>
> Regards
> JB
>
> On Oct 25, 2016, at 14:43, Robert Metzger <rmetz...@apache.org> wrote:
>> +1 for a bugfix release soon.
>>
>> On Tue, Oct 25, 2016 at 10:53 AM, Stephan Ewen <se...@apache.org> wrote:
>>
>>> Thanks for starting this, Ufuk.
>>>
>>> I would like to add the following issues to 1.1.4:
>>>
>>> Build errors due to Storm dependencies *(fix pending)*
>>>   - [FLINK-4298] [storm compatibility] Add proper repository for Closure dependencies.
>>>
>>> Stability on S3 considering eventual consistency *(fix pending)*
>>>   - [FLINK-4218] [checkpoints] Do not fail checkpoints when state size cannot be determined
>>>
>>> Avoiding Zombie TaskManagers *(still needs to be done)*
>>>   - [FLINK-3347] [akka] TaskManager (or its ActorSystem) need to restart in case they notice quarantine
>>>
>>> Adding a limit to the amount of data spilled during checkpoint alignments *(fix is work in progress)*
>>>   - [FLINK-4904] [checkpoints] Add a limit for how much data may be spilled in checkpoint alignments
>>>
>>> I can push the first two fixes to the 1.1.4 branch in a bit, the fourth one later today.
>>> The third one (akka) is still pending.
>>>
>>> Best,
>>> Stephan
>>>
>>> On Mon, Oct 24, 2016 at 3:32 PM, Ufuk Celebi <u...@apache.org> wrote:
>>>
>>>> Hey all,
>>>>
>>>> I would like to start the discussion for kicking off the next bug fix release, Flink 1.1.4. What do you think about aiming for a RC by end of this week?
>>>>
>>>> Users reported some instabilities/inconveniences that would be good to fix.
>>>>
>>>> Personally, I would like to backport the following fixes:
>>>>
>>>> (1) https://issues.apache.org/jira/browse/FLINK-4619: Answer client if savepoint restore fails (Already merged for master, needs minimal adjustment for 1.1)
>>>> (2) https://issues.apache.org/jira/browse/FLINK-4715: Safety net for stuck task cancellation (Already reviewed for master, waiting for tests of the backport to finish)
>>>> (3) https://issues.apache.org/jira/browse/FLINK-4510: Always create CheckpointCoordinator (Already merged for master, needs minimal adjustments for 1.1)
>>>>
>>>> Furthermore, I would like to address the following:
>>>>
>>>> (4) https://issues.apache.org/jira/browse/FLINK-4445: Add option to ignore unmatched state when restoring from savepoint
>>>> (5) https://issues.apache.org/jira/browse/FLINK-4894: Don't block on buffer request after broadcast event
>>>>
>>>> Strictly speaking, (4) is not a bug fix. But given that it would only add an optional flag to savepoint restoring and should have been addressed for 1.1.0 already, I would like to get it in.
>>>>