@Andrey about the increase in metaspace size
   - I have no concerns for 1.11.0.
   - For 1.10.1 I am not completely sure, because users expect to upgrade to
a bugfix release without config adjustments. That might not be possible with
this change.
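
To illustrate the concern, here is a rough sketch of the arithmetic. It assumes a deliberately simplified model (not Flink's actual memory calculation): the total process size is fixed by an existing flink-conf.yaml, and metaspace plus a JVM overhead are subtracted from it. The 1568m process size, the 192m overhead and the 96m previous metaspace value are assumptions for illustration only; 256m is the value discussed in this thread.

```python
# Simplified model (an assumption, not Flink's real memory calculation):
# with a fixed total process size, memory reserved for metaspace is no
# longer available to heap / managed memory / network buffers.

def remaining_for_flink_mb(process_mb: int, metaspace_mb: int, overhead_mb: int) -> int:
    """Return the memory (in MB) left after metaspace and JVM overhead."""
    return process_mb - metaspace_mb - overhead_mb

PROCESS_MB = 1568        # hypothetical total process size from an existing config
OVERHEAD_MB = 192        # hypothetical JVM overhead
OLD_METASPACE_MB = 96    # assumed previous default
NEW_METASPACE_MB = 256   # value discussed in this thread

before = remaining_for_flink_mb(PROCESS_MB, OLD_METASPACE_MB, OVERHEAD_MB)
after = remaining_for_flink_mb(PROCESS_MB, NEW_METASPACE_MB, OVERHEAD_MB)

# The difference is what an unchanged setup would lose after the upgrade.
print(f"before={before}m after={after}m lost={before - after}m")
```

Under these assumptions, a user who wants to keep the previous split after upgrading would have to pin taskmanager.memory.jvm-metaspace.size explicitly in flink-conf.yaml, which is exactly the kind of config adjustment a bugfix release should ideally not require.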

On Thu, Mar 12, 2020 at 12:55 PM Andrey Zagrebin <azagrebin.apa...@gmail.com>
wrote:

>
> > About "FLINK-16142 Memory Leak causes Metaspace OOM error on repeated job"
>
> My understanding is that the issue is basically covered by:
>
> - [FLINK-16225] Metaspace Out Of Memory should be handled as Fatal Error
> in TaskManager
>    no full consensus there but improving error message for existing task
> thread fatal handling could be done at least
>
> - [FLINK-16406] Increase default value for JVM Metaspace to minimise its
> OutOfMemoryError
>    see further
>
> - [FLINK-16246] Exclude "SdkMBeanRegistrySupport" from dynamically loaded
> AWS connectors
>   not sure whether this is a blocker, but it looks close to being resolved
>
> > About "FLINK-16406 Increase default value for JVM Metaspace"
> >  - Have we consensus that this is okay for a bugfix release? It changes
> > setups, takes away memory from heap / managed memory on existing setups
> > that keep their flink-conf.yaml.
>
> My understanding was that increasing to 256m resolved the reported problems
> and we decided to make the change so I have merged it today as there were
> no more concerns.
> If there are concerns I can revert it.
>
> On the other hand, I think improving the error message with a reference to
> the metaspace option should help the most,
> because users would not have to read all the docs to fix it.
> Then maybe this change is not even needed.
>
> Best,
> Andrey
>
> > On 12 Mar 2020, at 12:28, Stephan Ewen <se...@apache.org> wrote:
> >
> > Good idea to go ahead with 1.10.1
> >
> > About "FLINK-16142 Memory Leak causes Metaspace OOM error on repeated job"
> >  - I don't think we have consensus on the exact solution, yet, and some of
> > the changes might also have side effects that are hard to predict, so I am
> > not sure we should rush this in.
> >
> > About "FLINK-16406 Increase default value for JVM Metaspace"
> >  - Have we consensus that this is okay for a bugfix release? It changes
> > setups, takes away memory from heap / managed memory on existing setups
> > that keep their flink-conf.yaml.
> >
> > We may need to unblock the release from these two issues and think about
> > having 1.10.2 in the near future.
> >
> > On Thu, Mar 12, 2020 at 7:15 AM Yu Li <car...@gmail.com> wrote:
> >
> >> Thanks for the reminder Jark. Will keep an eye on these two.
> >>
> >> Best Regards,
> >> Yu
> >>
> >>
> >> On Thu, 12 Mar 2020 at 12:32, Jark Wu <imj...@gmail.com> wrote:
> >>
> >>> Thanks for driving this release, Yu!
> >>> +1 to start 1.10.1 release cycle.
> >>>
> >>> From the Table SQL module, I think we should also try to get in the
> >>> following issues:
> >>> - FLINK-16441: Allow users to override flink-conf parameters from SQL CLI
> >>> environment
> >>>  this allows users to set e.g. statebackend, watermark interval,
> >>> exactly-once/at-least-once, in the SQL CLI
> >>> - FLINK-16217: SQL Client crashed when any uncatched exception is thrown
> >>>  this will greatly improve the experience when using the SQL CLI
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>> On Wed, 11 Mar 2020 at 20:37, Yu Li <car...@gmail.com> wrote:
> >>>
> >>>> Thanks for the suggestion Andrey! I've added 1.10.1 into FLINK-16225 fix
> >>>> versions and promoted its priority to Critical. Will also watch the
> >>>> progress of FLINK-16108/FLINK-16408.
> >>>>
> >>>> Best Regards,
> >>>> Yu
> >>>>
> >>>>
> >>>> On Wed, 11 Mar 2020 at 18:18, Andrey Zagrebin <azagre...@apache.org>
> >>>> wrote:
> >>>>
> >>>>> Hi Yu,
> >>>>>
> >>>>> Thanks for kicking off the 1.10.1 release discussion!
> >>>>>
> >>>>> Apart from
> >>>>> - FLINK-16406 Increase default value for JVM Metaspace to minimise its
> >>>>> OutOfMemoryError
> >>>>> which should be merged soon
> >>>>>
> >>>>> I think we should also try to get in the following issues:
> >>>>>
> >>>>> - [FLINK-16225] Metaspace Out Of Memory should be handled as Fatal Error in
> >>>>> TaskManager
> >>>>> This should solve the Metaspace problem even in a better way because OOM
> >>>>> failure should point users to the docs immediately
> >>>>>
> >>>>> - [FLINK-16408] Bind user code class loader to lifetime of a slot
> >>>>> This should give a better protection against class loading leaks
> >>>>>
> >>>>> - [FLINK-16018] Improve error reporting when submitting batch job (instead
> >>>>> of AskTimeoutException)
> >>>>> This problem has recently happened for multiple users
> >>>>>
> >>>>> Best,
> >>>>> Andrey
> >>>>>
> >>>>>
> >>>>> On Wed, Mar 11, 2020 at 8:46 AM Jingsong Li <jingsongl...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Thanks for driving, Yu. +1 for starting the 1.10.1 release.
> >>>>>>
> >>>>>> Some issues are very important, and users are looking forward to them.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jingsong Lee
> >>>>>>
> >>>>>> On Wed, Mar 11, 2020 at 2:52 PM Yangze Guo <karma...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Thanks for driving this release, Yu!
> >>>>>>>
> >>>>>>> +1 for starting the 1.10.1 release cycle.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Yangze Guo
> >>>>>>>
> >>>>>>> On Wed, Mar 11, 2020 at 1:42 PM Xintong Song <tonysong...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Yu,
> >>>>>>>> Thanks for the explanation.
> >>>>>>>> I've no concerns. I was just trying to get some inputs for prioritizing
> >>>>>>>> tasks on my side, and ~1month sounds good to me.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thank you~
> >>>>>>>>
> >>>>>>>> Xintong Song
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Mar 11, 2020 at 12:15 PM Yu Li <car...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> bq. what is the time plan for 1.10.1?
> >>>>>>>>>
> >>>>>>>>> According to the history, the first patch release of a major version
> >>>>>>>>> takes ~1 month from the start of the discussion, depending on how quickly
> >>>>>>>>> blocker issues are resolved:
> >>>>>>>>> * 1.8.1: started discussion on May 28th [1], released on Jul 3rd [2]
> >>>>>>>>> * 1.9.1: started discussion on Sep 23rd [3], released on Oct 19th [4]
> >>>>>>>>>
> >>>>>>>>> We won't rush to match the history of course, but could use it as a
> >>>>>>>>> reference. And please feel free to let me know if you have any concerns,
> >>>>>>>>> Xintong. Thanks.
> >>>>>>>>>
> >>>>>>>>> Best Regards,
> >>>>>>>>> Yu
> >>>>>>>>>
> >>>>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-8-1-td29154.html
> >>>>>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-8-1-released-td30124.html
> >>>>>>>>> [3] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Releasing-Flink-1-9-1-td33343.html
> >>>>>>>>> [4] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-1-9-1-released-td34170.html
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Wed, 11 Mar 2020 at 11:54, Xintong Song <tonysong...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Thanks Yu, for the kick off and volunteering to be the release
> >>>>>>>>>> manager.
> >>>>>>>>>>
> >>>>>>>>>> +1 for the proposal.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> One quick question, what is the time plan for 1.10.1?
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Thank you~
> >>>>>>>>>>
> >>>>>>>>>> Xintong Song
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 11, 2020 at 11:51 AM Zhijiang
> >>>>>>>>>> <wangzhijiang...@aliyun.com.invalid> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks for driving this release, Yu!
> >>>>>>>>>>> +1 on my side
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Zhijiang
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>> ------------------------------------------------------------------
> >>>>>>>>>>> From:Yu Li <car...@gmail.com>
> >>>>>>>>>>> Send Time:2020 Mar. 10 (Tue.) 20:25
> >>>>>>>>>>> To:dev <dev@flink.apache.org>
> >>>>>>>>>>> Subject:Re: [DISCUSS] Releasing Flink 1.10.1
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the supplement Hequn. Yes will also keep an eye on these
> >>>>>>>>>>> existing blocker issues.
> >>>>>>>>>>>
> >>>>>>>>>>> Best Regards,
> >>>>>>>>>>> Yu
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Tue, 10 Mar 2020 at 19:10, Hequn Cheng <he...@apache.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Yu,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thanks a lot for raising the discussion and volunteering as the release
> >>>>>>>>>>>> manager!
> >>>>>>>>>>>>
> >>>>>>>>>>>> I found there are some other issues[1] which are marked as a blocker:
> >>>>>>>>>>>> - FLINK-16454 Update the copyright year in NOTICE files
> >>>>>>>>>>>> - FLINK-16262 Class loader problem with
> >>>>>>>>>>>> FlinkKafkaProducer.Semantic.EXACTLY_ONCE and usrlib directory
> >>>>>>>>>>>> - FLINK-16170 SearchTemplateRequest ClassNotFoundException when use
> >>>>>>>>>>>> flink-sql-connector-elasticsearch7
> >>>>>>>>>>>> - FLINK-16018 Improve error reporting when submitting batch job (instead of
> >>>>>>>>>>>> AskTimeoutException)
> >>>>>>>>>>>>
> >>>>>>>>>>>> These may also need to be resolved in 1.10.1.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Hequn
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1] https://issues.apache.org/jira/projects/FLINK/versions/12346891
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, Mar 10, 2020 at 6:48 PM Yu Li <car...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Jincheng,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Yes, your help would be very helpful. Thanks a lot!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, 10 Mar 2020 at 18:24, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks for bringing up the discussion Yu. I would like to give you a hand at
> >>>>>>>>>>>>>> the last stage when the RC is finished (if you need it). :)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> Jincheng
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Tue, Mar 10, 2020 at 5:49 PM Yu Li <car...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> It has been almost one month since we released Flink 1.10.0. We already
> >>>>>>>>>>>>>>> have more than 40 resolved improvements/bugs in the release-1.10 branch,
> >>>>>>>>>>>>>>> and I propose to start the 1.10.1 release cycle.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Most noticeable fixes are:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - FLINK-16241 [legal] Remove the license and notice file in flink-ml-lib
> >>>>>>>>>>>>>>> module
> >>>>>>>>>>>>>>> - FLINK-16313 Fix RocksDB resource leak in flink-state-processor-api
> >>>>>>>>>>>>>>> - FLINK-16161 Statistics zero should be known in HiveCatalog
> >>>>>>>>>>>>>>> - FLINK-2336 ArrayIndexOufOBoundsException in TypeExtractor when mapping
> >>>>>>>>>>>>>>> - FLINK-16108 StreamSQLExample is failed if running in blink planner
> >>>>>>>>>>>>>>> - FLINK-16139 Co-location constraints are not reset on task recovery in
> >>>>>>>>>>>>>>> DefaultScheduler
> >>>>>>>>>>>>>>> - FLINK-16414 Create udaf/udtf function using sql casuing
> >>>>>>>>>>>>>>> ValidationException: SQL validation failed
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Furthermore, I think the following issues should be merged before 1.10.1
> >>>>>>>>>>>>>>> release (especially the Metaspace OOM issue):
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - FLINK-16142 Memory Leak causes Metaspace OOM error on repeated job
> >>>>>>>>>>>>>>> submission
> >>>>>>>>>>>>>>> - FLINK-16406 Increase default value for JVM Metaspace to minimise its
> >>>>>>>>>>>>>>> OutOfMemoryError
> >>>>>>>>>>>>>>> - FLINK-16047 Blink planner produces wrong aggregate results with state
> >>>>>>>>>>>>>>> clean up
> >>>>>>>>>>>>>>> - FLINK-16070 Blink planner can not extract correct unique key for
> >>>>>>>>>>>>>>> UpsertStreamTableSink
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I would volunteer as the release manager and kick off the release process
> >>>>>>>>>>>>>>> once blocker issues are merged. What do you think?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If there are any concerns or missing blocker issues that need to be fixed in
> >>>>>>>>>>>>>>> 1.10.1, please let me know. Thanks.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>>>>> Yu
> >>>>>>>>>>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> Best, Jingsong Lee
