Benefit of a backport, as I see it, is increased stability. The danger is potentially breaking some code that was casting FileSystems to subtypes like LocalFileSytem. I don’t know how common that would be in user code.
> Am 28.10.2016 um 14:27 schrieb Ufuk Celebi <u...@apache.org>: > > Thanks for all your feedback. > > If there are no objections, I would like to stick to the mentioned > issues in this thread and create RC1 as soon as they are all > addressed. This will probably not be this week though, but it looks > good for next week. > > DONE > ===== > - FLINK-4619: Answer client if savepoint restore fails > - FLINK-4715: Safety net for stuck task cancellation > - FLINK-4510: Always create CheckpointCoordinator > - FLINK-4894: Don't block on buffer request after broadcast event > - FLINK-4298: Add proper repository for Closure dependencies > - FLINK-4218: Do not fail checkpoints when state size cannot be determined > - FLINK-3347: TaskManager (or its ActorSystem) need to restart in case > they notice quarantine > - FLINK-4875: Use correct operator name > - FLINK-4913: Include user jars in system class loader > > PENDING REVIEW > =============== > - FLINK-4445: Add option to ignore unmatched state when restoring from > savepoint => https://github.com/apache/flink/pull/2713 > - FLINK-4932: Don't let ExecutionGraph fail when in state Restarting > => https://github.com/apache/flink/pull/2711 > - FLINK-4933: ExecutionGraph.scheduleOrUpdateConsumers can fail the > ExecutionGraph => https://github.com/apache/flink/pull/2701 > > OPEN > ===== > - FLINK-4904: Add a limit for how much data may be spilled in > checkpoint alignments => fix pending > - FLINK-4910: Introduce safety net for closing file system streams => > @Stephan, Stefan: What's the conclusion of your discussion whether to > backport this or not? > > > On Wed, Oct 26, 2016 at 9:57 PM, dan bress <danbr...@gmail.com> wrote: >> +1 for this release, >> also +1 to Chesnay's suggesting for including this: [FLINK-4875] [metrics] >> Use correct operator name >> >> Dan >> >> On Wed, Oct 26, 2016 at 5:06 AM Till Rohrmann <trohrm...@apache.org> wrote: >> >>> I'll work on FLINK-3347. Additionally I would like to get in >>> >>> - https://issues.apache.org/jira/browse/FLINK-4932: Don't let >>> ExecutionGraph fail when in state Restarting >>> - https://issues.apache.org/jira/browse/FLINK-4933: >>> ExecutionGraph.scheduleOrUpdateConsumers >>> can fail the ExecutionGraph >>> >>> Cheers, >>> Till >>> >>> On Wed, Oct 26, 2016 at 1:02 PM, Stephan Ewen <se...@apache.org> wrote: >>> >>>> Concerning backporting the "I/O streams safety net" - we need to make >>> sure >>>> that this does not change any behavior that users may implicitly expect. >>>> >>>> >>>> On Wed, Oct 26, 2016 at 11:21 AM, Maximilian Michels <m...@apache.org> >>>> wrote: >>>> >>>>> +1 for a 1.1.4 release >>>>> >>>>> We could backport putting user jars into the system class loader for >>>>> per-job Yarn clusters: https://github.com/apache/flink/pull/2692 >>>>> Arguably, this is somewhat a new feature but it gets rid of duplicate >>>>> class loading issues users experienced in practice. >>>>> >>>>> We already have the following commits on the release-1.1 branch: >>>>> >>>>> 05a5f46 [FLINK-4862] fix Timer register in ContinuousEventTimeTrigger >>>>> 5731672 [FLINK-4581] [table] Fix Table API throwing "No suitable driver >>>>> found for jdbc:calcite" >>>>> 9c87f92 [FLINK-4586] [core] Broken AverageAccumulator >>>>> 210230c [FLINK-4829] snapshot accumulators on a best-effort basis >>>>> c1d6b24 [FLINK-4829] protect user accumulators against concurrent >>> updates >>>>> fe464b4 [FLINK-4709] [core] Fix resource leak in >>>> InputStreamFSInputWrapper >>>>> 9f72698 [FLINK-4108] [scala] Respect ResultTypeQueryable for >>>> InputFormats. >>>>> 9591d50 [FLINK-4506] [DataSet] Fix documentation of CsvOutputFormat >>> about >>>>> incorrect default of allowNullValues >>>>> c9433bf [FLINK-3706] Fix YARN test instability >>>>> 2203f74 [FLINK-4778] [docs] Fix WordCount parameters in CLI examples. >>>>> >>>>> -Max >>>>> >>>>> >>>>> On Wed, Oct 26, 2016 at 7:05 AM, Jean-Baptiste Onofré <j...@nanthrax.net >>>> >>>>> wrote: >>>>>> +1 >>>>>> >>>>>> Looking forward this release ! >>>>>> >>>>>> Regards >>>>>> JB >>>>>> >>>>>> >>>>>> >>>>>> On Oct 25, 2016, 14:43, at 14:43, Robert Metzger < >>> rmetz...@apache.org> >>>>> wrote: >>>>>>> +1 for a bugfix release soon. >>>>>>> >>>>>>> On Tue, Oct 25, 2016 at 10:53 AM, Stephan Ewen <se...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks fort starting this Ufuk. >>>>>>>> >>>>>>>> I would like to add the following issues to 1.1.4: >>>>>>>> >>>>>>>> Build errors due to Storm dependencies *(fix pending)* >>>>>>>> - [FLINK-4298] [storm compatibility] Add proper repository for >>>>>>> Closure >>>>>>>> dependencies. >>>>>>>> >>>>>>>> Stability on S3 considering eventual consistency *(fix pending)* >>>>>>>> - [FLINK-4218] [checkpoints] Do not fail checkpoints when state >>>>>>> size >>>>>>>> cannot be determined >>>>>>>> >>>>>>>> Avoiding Zombie TaskManagers *(still needs to be done)* >>>>>>>> - [FLINK-3347] [akka] TaskManager (or its ActorSystem) need to >>>>>>> restart >>>>>>>> in case they notice quarantine >>>>>>>> >>>>>>>> Adding a limit to the amount of data spilled during checkpoint >>>>>>> alignments >>>>>>>> *(fix >>>>>>>> is work in progress)* >>>>>>>> - [FLINK-4904] [checkpoints] Add a limit for how much data may >>> be >>>>>>>> spilled in checkpoint alignments >>>>>>>> >>>>>>>> >>>>>>>> I can push the first two fixes to the 1.1.4 branch in a bit, the >>>>>>> fourth one >>>>>>>> later today. >>>>>>>> The third one (akka) is still pending. >>>>>>>> >>>>>>>> Best, >>>>>>>> Stephan >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Oct 24, 2016 at 3:32 PM, Ufuk Celebi <u...@apache.org> >>> wrote: >>>>>>>> >>>>>>>>> Hey all, >>>>>>>>> >>>>>>>>> I would like to start the discussion for kicking off the next bug >>>>>>> fix >>>>>>>>> release, Flink 1.1.4. What do you think about aiming for a RC by >>>>>>> end >>>>>>>>> of this week? >>>>>>>>> >>>>>>>>> Users reported some instabilities/inconveniences that would be >>> good >>>>>>> to >>>>>>>> fix. >>>>>>>>> >>>>>>>>> Personally, I would like to backport the following fixes: >>>>>>>>> >>>>>>>>> (1) https://issues.apache.org/jira/browse/FLINK-4619: Answer >>>> client >>>>>>> if >>>>>>>>> savepoint restore fails (Already merged for master, needs minimal >>>>>>>>> adjustment for 1.1) >>>>>>>>> (2) https://issues.apache.org/jira/browse/FLINK-4715: Safety net >>>>>>> for >>>>>>>>> stuck task cancellation (Already reviewed for master, waiting for >>>>>>>>> tests to finish of backport) >>>>>>>>> (3) https://issues.apache.org/jira/browse/FLINK-4510: Always >>>> create >>>>>>>>> CheckpointCoordinator (Already merged for master, needs minimal >>>>>>>>> adjustments for 1.1) >>>>>>>>> >>>>>>>>> Furthermore, I would like to address the following: >>>>>>>>> >>>>>>>>> (4) https://issues.apache.org/jira/browse/FLINK-4445: Add option >>>> to >>>>>>>>> ignore unmatched state when restoring from savepoint >>>>>>>>> (5) https://issues.apache.org/jira/browse/FLINK-4894: Don't >>> block >>>>>>> on >>>>>>>>> buffer request after broadcast event >>>>>>>>> >>>>>>>>> Strictly speaking, the (4) is not a bug fix. But given that it >>>>>>> would >>>>>>>>> only add an optional flag to savepoint restoring and should have >>>>>>> been >>>>>>>>> addressed for 1.1.0 already, I would like to get it in. >>>>>>>>> >>>>>>>> >>>>> >>>> >>>