Thanks Robert for taking care of this!

On March 24, 2017 at 7:24:59 PM, Robert Metzger (rmetz...@apache.org) wrote:

Hi Gordon,

I didn't see your request for a release manager. I'm volunteering to take this one.
It's a little bit easier for me as a PMC member to do the actual release in the end.

On Fri, Mar 24, 2017 at 7:02 AM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote:

> Update for 1.2.1:
>
> The last fix was just merged!
>
> Since nobody else seems interested in managing 1.2.1, I can also help with
> this one :)
> I’ll create the release candidate over the weekend so we can start the
> testing / voting next Monday.
>
> - Gordon
>
> On March 22, 2017 at 12:35:25 AM, Tzu-Li (Gordon) Tai (tzuli...@apache.org)
> wrote:
>
> Sorry, I missed one other pending issue for Flink 1.2.1:
>
> - https://issues.apache.org/jira/browse/FLINK-5972
> Disallow shrinking merging windows. This would replace
> https://issues.apache.org/jira/browse/FLINK-5713, which was previously
> listed as a blocker for 1.2.1.
> Status: PR review pending - https://github.com/apache/flink/pull/3587
>
> On March 22, 2017 at 12:23:03 AM, Tzu-Li (Gordon) Tai (tzuli...@apache.org)
> wrote:
>
> Update for Flink 1.2.1:
>
> There’s only one PR pending that is LGTM -
> https://issues.apache.org/jira/browse/FLINK-6084
> Fix for Cassandra connector dropping metrics-core dependency.
>
> We can proceed to create the release candidate very soon :-)
> Release 1.1.5 RC1 seems to be in good shape so far, so hopefully we can
> start voting for 1.2.1 tomorrow.
>
> Also, we’re still lacking a release manager for 1.2.1. Is anyone
> interested in volunteering for this release?
> If nobody steps up for it before tomorrow, I can also do it.
>
> Cheers,
> Gordon
>
> On March 18, 2017 at 12:52:48 AM, Robert Metzger (rmetz...@apache.org)
> wrote:
>
> I don't think that this issue should be a reason to hold back a bugfix
> release.
> There are workarounds for the problem you are describing. Once we've fixed
> it, we can include it in the next bugfix release.
>
> On Fri, Mar 17, 2017 at 4:22 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
> > I propose to fix https://issues.apache.org/jira/browse/FLINK-6103 before
> > issuing a release
> >
> > On Fri, Mar 17, 2017 at 8:12 AM, Ufuk Celebi <u...@apache.org> wrote:
> >
> > > Cool! Thanks for taking care of this Gordon :-)
> > >
> > > On Fri, Mar 17, 2017 at 7:13 AM, Tzu-Li (Gordon) Tai
> > > <tzuli...@apache.org> wrote:
> > > > Update for 1.1.5:
> > > > The last fixes for 1.1.5 are in! I will create the RC today and start
> > > > the vote.
> > > >
> > > > Cheers,
> > > > Gordon
> > > >
> > > >
> > > > On March 17, 2017 at 1:14:53 AM, Robert Metzger (rmetz...@apache.org)
> > > > wrote:
> > > >
> > > > The Cassandra connector is probably not usable in Flink 1.2.0. I would
> > > > like to include a fix in 1.2.1:
> > > > https://issues.apache.org/jira/browse/FLINK-6084
> > > >
> > > > Please let me know if this fix becomes a blocker for the 1.2.1 release.
> > > > If so, I can validate the fix myself to speed things up.
> > > >
> > > > On Thu, Mar 16, 2017 at 9:41 AM, Jinkui Shi <shijinkui...@163.com>
> > > > wrote:
> > > >
> > > >> @Tzu-Li (Gordon) Tai
> > > >>
> > > >> FLINK-5650 is fixed by [1]. Chesnay Schepler, please push a PR.
> > > >>
> > > >> [1] https://github.com/zentol/flink/tree/5650_python_test_debug
> > > >>
> > > >>
> > > >> > On March 16, 2017, at 3:37 AM, Stephan Ewen <se...@apache.org> wrote:
> > > >> >
> > > >> > Thanks for the update!
> > > >> >
> > > >> > Just merged to 1.2.1 also: [FLINK-5962] [checkpoints] Remove scheduled
> > > >> > cancel-task from timer queue to prevent memory leaks
> > > >> >
> > > >> > The remaining issue list looks good, but I would say that (5) is
> > > >> > optional.
> > > >> > It is not a critical production bug.
> > > >> >
> > > >> >
> > > >> >
> > > >> > On Wed, Mar 15, 2017 at 5:38 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org>
> > > >> > wrote:
> > > >> >
> > > >> >> Thanks a lot for the updates so far, everyone!
> > > >> >>
> > > >> >> From the discussion so far, below are the still-unfixed pending issues
> > > >> >> for the 1.1.5 / 1.2.1 releases.
> > > >> >>
> > > >> >> Since there’s only one backport for 1.1.5 left, I think having an RC for
> > > >> >> 1.1.5 near the end of this week / early next week is very promising, as
> > > >> >> basically everything is already in.
> > > >> >> I’d be happy to volunteer to help manage the release for 1.1.5, and
> > > >> >> prepare the RC when it’s ready :)
> > > >> >>
> > > >> >> For 1.2.1, we can leave the pending list here for tracking, and come
> > > >> >> back to update it in the near future.
> > > >> >>
> > > >> >> If there’s anything I missed, please let me know!
> > > >> >>
> > > >> >>
> > > >> >> =========== Still pending for Flink 1.1.5 ===========
> > > >> >>
> > > >> >> (1) https://issues.apache.org/jira/browse/FLINK-5701
> > > >> >> Broken at-least-once Kafka producer.
> > > >> >> Status: backport PR pending - https://github.com/apache/flink/pull/3549 .
> > > >> >> Since it is a relatively self-contained change, I expect this to be a
> > > >> >> fast fix.
> > > >> >>
> > > >> >>
> > > >> >>
> > > >> >> =========== Still pending for Flink 1.2.1 ===========
> > > >> >>
> > > >> >> (1) https://issues.apache.org/jira/browse/FLINK-5808
> > > >> >> Fix Missing verification for setParallelism and setMaxParallelism
> > > >> >> Status: PR - https://github.com/apache/flink/pull/3509, review in
> > > >> >> progress
> > > >> >>
> > > >> >> (2) https://issues.apache.org/jira/browse/FLINK-5713
> > > >> >> Protect against NPE in WindowOperator window cleanup
> > > >> >> Status: PR - https://github.com/apache/flink/pull/3535, review pending
> > > >> >>
> > > >> >> (3) https://issues.apache.org/jira/browse/FLINK-6044
> > > >> >> TypeSerializerSerializationProxy.read() doesn't verify the read buffer
> > > >> >> length
> > > >> >> Status: Fixed for master, 1.2 backport pending
> > > >> >>
> > > >> >> (4) https://issues.apache.org/jira/browse/FLINK-5985
> > > >> >> Flink treats every task as stateful (making topology changes impossible)
> > > >> >> Status: PR - https://github.com/apache/flink/pull/3543, review in
> > > >> >> progress
> > > >> >>
> > > >> >> (5) https://issues.apache.org/jira/browse/FLINK-5650
> > > >> >> Flink-python tests taking up too much time
> > > >> >> Status: I think Chesnay has made some progress on this one; we can
> > > >> >> see if we want to make this a blocker
> > > >> >>
> > > >> >>
> > > >> >> Cheers,
> > > >> >> Gordon
> > > >> >>
> > > >> >> On March 15, 2017 at 7:16:53 PM, Jinkui Shi (shijinkui...@163.com)
> > > >> >> wrote:
> > > >> >>
> > > >> >> Can we fix this issue in 1.2.1:
> > > >> >>
> > > >> >> Flink-python tests cost too long time
> > > >> >> https://issues.apache.org/jira/browse/FLINK-5650
> > > >> >>
> > > >> >>> On March 15, 2017, at 6:29 PM, Vladislav Pernin
> > > >> >>> <vladislav.per...@gmail.com> wrote:
> > > >> >>>
> > > >> >>> I just tested it in my reproducer. It works.
> > > >> >>>
> > > >> >>> 2017-03-15 11:22 GMT+01:00 Aljoscha Krettek <aljos...@apache.org>:
> > > >> >>>
> > > >> >>>> I did in fact just open a PR for
> > > >> >>>>> https://issues.apache.org/jira/browse/FLINK-6001
> > > >> >>>>> NPE on TumblingEventTimeWindows with ContinuousEventTimeTrigger and
> > > >> >>>>> allowedLateness
> > > >> >>>>
> > > >> >>>>
> > > >> >>>> On Tue, Mar 14, 2017, at 18:20, Vladislav Pernin wrote:
> > > >> >>>>> Hi,
> > > >> >>>>>
> > > >> >>>>> I would also include the following (not yet resolved) issue in the
> > > >> >>>>> 1.2.1 scope:
> > > >> >>>>>
> > > >> >>>>> https://issues.apache.org/jira/browse/FLINK-6001
> > > >> >>>>> NPE on TumblingEventTimeWindows with ContinuousEventTimeTrigger and
> > > >> >>>>> allowedLateness
> > > >> >>>>>
> > > >> >>>>> 2017-03-14 17:34 GMT+01:00 Ufuk Celebi <u...@apache.org>:
> > > >> >>>>>
> > > >> >>>>>> Big +1 Gordon!
> > > >> >>>>>>
> > > >> >>>>>> I think (10) is very critical to have in 1.2.1.
> > > >> >>>>>>
> > > >> >>>>>> - Ufuk
> > > >> >>>>>>
> > > >> >>>>>>
> > > >> >>>>>> On Tue, Mar 14, 2017 at 3:37 PM, Stefan Richter
> > > >> >>>>>> <s.rich...@data-artisans.com> wrote:
> > > >> >>>>>>> Hi,
> > > >> >>>>>>>
> > > >> >>>>>>> I would suggest also including in 1.2.1:
> > > >> >>>>>>>
> > > >> >>>>>>> (9) https://issues.apache.org/jira/browse/FLINK-6044
> > > >> >>>>>>> Replaces unintentional calls to InputStream#read(…) with the intended
> > > >> >>>>>>> and correct InputStream#readFully(…)
> > > >> >>>>>>> Status: PR
> > > >> >>>>>>>
> > > >> >>>>>>> (10) https://issues.apache.org/jira/browse/FLINK-5985
> > > >> >>>>>>> Flink 1.2 was creating state handles for stateless tasks, which caused
> > > >> >>>>>>> trouble at restore time for users who wanted to make changes to their
> > > >> >>>>>>> topology that only involve stateless operators.
> > > >> >>>>>>> Status: PR
> > > >> >>>>>>>
> > > >> >>>>>>>
> > > >> >>>>>>>> On 14.03.2017 at 15:15, Till Rohrmann <trohrm...@apache.org> wrote:
> > > >> >>>>>>>>
> > > >> >>>>>>>> Thanks for kicking off the discussion Tzu-Li. I'd like to add the
> > > >> >>>>>>>> following issues, which have already been merged into the 1.2-release
> > > >> >>>>>>>> and 1.1-release branches:
> > > >> >>>>>>>>
> > > >> >>>>>>>> 1.2.1:
> > > >> >>>>>>>>
> > > >> >>>>>>>> (7) https://issues.apache.org/jira/browse/FLINK-5942
> > > >> >>>>>>>> Hardens the checkpoint recovery in case of corrupted ZooKeeper data.
> > > >> >>>>>>>> Corrupted checkpoints will now be skipped.
> > > >> >>>>>>>> Status: Merged
> > > >> >>>>>>>>
> > > >> >>>>>>>> (8) https://issues.apache.org/jira/browse/FLINK-5940
> > > >> >>>>>>>> Hardens the checkpoint recovery in case we cannot retrieve the
> > > >> >>>>>>>> completed checkpoint from the meta data state handle retrieved from
> > > >> >>>>>>>> ZooKeeper. This can, for example, happen if the meta data is deleted.
> > > >> >>>>>>>> Checkpoints with unretrievable state handles are skipped.
> > > >> >>>>>>>> Status: Merged
> > > >> >>>>>>>>
> > > >> >>>>>>>> 1.1.5:
> > > >> >>>>>>>>
> > > >> >>>>>>>>
> > > >> >>>>>>>> (7) https://issues.apache.org/jira/browse/FLINK-5942
> > > >> >>>>>>>> Hardens the checkpoint recovery in case of corrupted ZooKeeper data.
> > > >> >>>>>>>> Corrupted checkpoints will now be skipped.
> > > >> >>>>>>>> Status: Merged
> > > >> >>>>>>>>
> > > >> >>>>>>>> (8) https://issues.apache.org/jira/browse/FLINK-5940
> > > >> >>>>>>>> Hardens the checkpoint recovery in case we cannot retrieve the
> > > >> >>>>>>>> completed checkpoint from the meta data state handle retrieved from
> > > >> >>>>>>>> ZooKeeper. This can, for example, happen if the meta data is deleted.
> > > >> >>>>>>>> Checkpoints with unretrievable state handles are skipped.
> > > >> >>>>>>>> Status: Merged
> > > >> >>>>>>>>
> > > >> >>>>>>>> Cheers,
> > > >> >>>>>>>> Till
> > > >> >>>>>>>>
> > > >> >>>>>>>> On Tue, Mar 14, 2017 at 12:02 PM, Tzu-Li (Gordon) Tai
> > > >> >>>>>>>> <tzuli...@apache.org> wrote:
> > > >> >>>>>>>>
> > > >> >>>>>>>>> Hi all!
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> I would like to start a discussion for the next bugfix release for
> > > >> >>>>>>>>> 1.1.x and 1.2.x.
> > > >> >>>>>>>>> There have been quite a few critical fixes for bugs in both
> > > >> >>>>>>>>> releases recently, and I think they deserve a bugfix release soon.
> > > >> >>>>>>>>> Most of the bugs were reported by users.
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> I’m starting the discussion for both bugfix releases because most
> > > >> >>>>>>>>> fixes span both releases (almost identical).
> > > >> >>>>>>>>> Of course, the actual RC votes and RC creation process don’t have
> > > >> >>>>>>>>> to be started together.
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> Here’s an overview of what’s been collected so far, for both bugfix
> > > >> >>>>>>>>> releases -
> > > >> >>>>>>>>> (it’s a list of what I’m aware of so far, and may be missing stuff;
> > > >> >>>>>>>>> please append and bring to attention as necessary :-) )
> > > >> >>>>>>>>>
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> For Flink 1.2.1:
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (1) https://issues.apache.org/jira/browse/FLINK-5701:
> > > >> >>>>>>>>> Async exceptions in the FlinkKafkaProducer are not checked on
> > > >> >>>>>>>>> checkpoints.
> > > >> >>>>>>>>> This compromises the producer’s at-least-once guarantee.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (2) https://issues.apache.org/jira/browse/FLINK-5949:
> > > >> >>>>>>>>> Do not check Kerberos credentials for non-Kerberos authentications.
> > > >> >>>>>>>>> MapR users are affected by this, and cannot submit Flink on YARN
> > > >> >>>>>>>>> jobs on a secured MapR cluster.
> > > >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3528, one +1
> > > >> >>>>>>>>> already
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (3) https://issues.apache.org/jira/browse/FLINK-6006:
> > > >> >>>>>>>>> Kafka Consumer can lose state if queried partition list is
> > > >> >>>>>>>>> incomplete on restore.
> > > >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3505, one +1
> > > >> >>>>>>>>> already
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (4) https://issues.apache.org/jira/browse/FLINK-6025:
> > > >> >>>>>>>>> KryoSerializer may use the wrong classloader when Kryo’s
> > > >> >>>>>>>>> JavaSerializer is used.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (5) https://issues.apache.org/jira/browse/FLINK-5771:
> > > >> >>>>>>>>> Fix multi-char delimiters in Batch InputFormats.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (6) https://issues.apache.org/jira/browse/FLINK-5934:
> > > >> >>>>>>>>> Set the Scheduler in the ExecutionGraph via its constructor. This
> > > >> >>>>>>>>> fixes a bug that causes HA recovery to fail.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>>
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> For Flink 1.1.5:
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (1) https://issues.apache.org/jira/browse/FLINK-5701:
> > > >> >>>>>>>>> Async exceptions in the FlinkKafkaProducer are not checked on
> > > >> >>>>>>>>> checkpoints.
> > > >> >>>>>>>>> This compromises the producer’s at-least-once guarantee.
> > > >> >>>>>>>>> Status: This is already merged for 1.2.1. I would personally like
> > > >> >>>>>>>>> to backport the fix for this to 1.1.5 also.
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (2) https://issues.apache.org/jira/browse/FLINK-6006:
> > > >> >>>>>>>>> Kafka Consumer can lose state if queried partition list is
> > > >> >>>>>>>>> incomplete on restore.
> > > >> >>>>>>>>> Status: PR - https://github.com/apache/flink/pull/3507, one +1
> > > >> >>>>>>>>> already
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (3) https://issues.apache.org/jira/browse/FLINK-6025:
> > > >> >>>>>>>>> KryoSerializer may use the wrong classloader when Kryo’s
> > > >> >>>>>>>>> JavaSerializer is used.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (4) https://issues.apache.org/jira/browse/FLINK-5771:
> > > >> >>>>>>>>> Fix multi-char delimiters in Batch InputFormats.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (5) https://issues.apache.org/jira/browse/FLINK-5934:
> > > >> >>>>>>>>> Set the Scheduler in the ExecutionGraph via its constructor. This
> > > >> >>>>>>>>> fixes a bug that causes HA recovery to fail.
> > > >> >>>>>>>>> Status: merged
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> (6) https://issues.apache.org/jira/browse/FLINK-5048:
> > > >> >>>>>>>>> Kafka Consumer (0.9/0.10) threading model leads to problematic
> > > >> >>>>>>>>> cancellation behavior.
> > > >> >>>>>>>>> Status: This fix was already released in 1.2.0, but never made it
> > > >> >>>>>>>>> into the 1.1.x bugfixes. Do we want to backport this also for 1.1.5?
> > > >> >>>>>>>>>
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> What do you think? From the list so far, we pretty much already
> > > >> >>>>>>>>> have everything in, so I think it would be nice to aim for RCs by
> > > >> >>>>>>>>> the end of this week.
> > > >> >>>>>>>>> Since both bugfix releases cover almost the same list of issues, I
> > > >> >>>>>>>>> think it shouldn’t be too hard for us to kick off both bugfix
> > > >> >>>>>>>>> releases around the same time.
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> Also FYI, here are the lists of JIRA tickets that are tagged with
> > > >> >>>>>>>>> “1.2.1” / “1.1.5” as the Fix Version and are still open.
> > > >> >>>>>>>>> We should probably check if there’s anything on there that we
> > > >> >>>>>>>>> should block on for the releases:
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> For 1.2.1:
> > > >> >>>>>>>>> https://issues.apache.org/jira/browse/FLINK-5711?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.2.1
> > > >> >>>>>>>>>
> > > >> >>>>>>>>> For 1.1.5:
> > > >> >>>>>>>>> https://issues.apache.org/jira/browse/FLINK-6006?jql=project%20%3D%20FLINK%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened)%20AND%20fixVersion%20%3D%201.1.5