+1 for the JVM metaspace and overhead changes.
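For anyone skimming the thread, here is a minimal flink-conf.yaml sketch of
the proposal as it currently stands (the key names are my reading of the
FLIP-49 option set and could still change; the values reflect the state of
the discussion, not final defaults):

  taskmanager.memory.jvm-metaspace.size: 64m   # proposed: down from 128m
  taskmanager.memory.jvm-overhead.min: 196m    # proposed: up from 128m
  taskmanager.memory.managed.fraction: 0.4     # under discussion: 0.3 vs. 0.4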
On Tue, Jan 14, 2020 at 11:19 AM Till Rohrmann <trohrm...@apache.org> wrote:

> I guess one of the most important results of this experiment is to have
> a good tuning guide available for users who are past the initial
> try-out phase, because the default settings will be something of a
> compromise. I assume that this is part of the outstanding FLIP-49
> documentation task.
>
> If we limit RocksDB's memory consumption by default, then I believe
> that 0.4 would give the better all-round experience, as it leaves a bit
> more memory for RocksDB. However, I'm a bit sceptical whether we should
> optimize the default settings for a configuration where the user still
> needs to activate the strict memory limiting for RocksDB. In that case,
> I would expect that the user could also adapt the managed memory
> fraction.
>
> Cheers,
> Till
>
> On Tue, Jan 14, 2020 at 3:39 AM Xintong Song <tonysong...@gmail.com>
> wrote:
>
>> Thanks for the feedback, Stephan and Kurt.
>>
>> @Stephan
>>
>> Regarding the managed memory fraction:
>> - It makes sense to keep the default value of 0.4 if we assume that
>> RocksDB's memory is limited by default.
>> - AFAIK, RocksDB currently does not limit its memory usage by default,
>> and I'm in favor of changing that.
>> - Personally, I don't like the idea that the out-of-the-box experience
>> (for which we set the default fraction) relies on users manually
>> turning another switch on.
>>
>> Regarding framework heap memory:
>> - The major reason we set it by default is, as you mentioned, to have
>> a safety net of a minimal JVM heap size.
>> - Also, considering the in-progress FLIP-56 (dynamic slot allocation),
>> we want to reserve some heap memory that will not go into the slot
>> profiles. That's why we derive the default value from the heap memory
>> usage of an empty task executor.
>>
>> @Kurt
>> Regarding metaspace:
>> - This config option ("taskmanager.memory.jvm-metaspace") only takes
>> effect on TMs. Currently we do not set the metaspace size for the JM.
>> - If we have the same metaspace problem on TMs, then yes, changing it
>> from 128MB to 64MB will make it worse. However, IMO a 10T TPC-DS
>> benchmark should not be considered an out-of-the-box experience, and
>> it makes sense to tune the configurations for it. I think the smaller
>> metaspace size is the better choice for the first try-out, where a job
>> should not be too complicated and the TM size could be relatively
>> small (e.g., 1g).
>>
>> Thank you~
>>
>> Xintong Song
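To make the "another switch" concrete: strictly bounding RocksDB by the
managed memory budget is a single option, so the setup Till describes would
look roughly like this (a sketch; I'm assuming the option name introduced by
the RocksDB memory-management work keeps this form in the release):

  state.backend.rocksdb.memory.managed: true   # bound RocksDB by managed memory
  taskmanager.memory.managed.fraction: 0.4     # and leave RocksDB a bit more room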
>>
>> On Tue, Jan 14, 2020 at 9:38 AM Kurt Young <ykt...@gmail.com> wrote:
>>
>>> Hi Xintong,
>>>
>>> IIRC, during our 10T TPC-DS benchmark we suffered from the JM's
>>> metaspace size and full GCs caused by lots of class loading for
>>> source input splits. Could you check whether changing the default
>>> value from 128MB to 64MB will make it worse?
>>>
>>> Correct me if I misunderstood anything; also cc @Jingsong
>>>
>>> Best,
>>> Kurt
>>>
>>> On Tue, Jan 14, 2020 at 3:44 AM Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> Hi all!
>>>>
>>>> Thanks a lot, Xintong, for this thorough analysis. Based on your
>>>> analysis, here are some thoughts:
>>>>
>>>> +1 to change the default JVM metaspace size from 128MB to 64MB
>>>> +1 to change the default JVM overhead min size from 128MB to 196MB
>>>>
>>>> Concerning the managed memory fraction, I am not sure I would change
>>>> it, for the following reasons:
>>>>
>>>> - We should assume RocksDB will be limited to managed memory by
>>>> default. This will either be active by default, or we would
>>>> encourage everyone to use it by default, because otherwise it is
>>>> super hard to reason about the RocksDB footprint.
>>>> - For standalone, a managed memory fraction of 0.3 is less than half
>>>> of the managed memory from 1.9.
>>>> - I am not sure the managed memory fraction is a value that all
>>>> users adjust immediately when scaling up the memory during their
>>>> first try-out phase. I would assume that most users initially only
>>>> adjust "memory.flink.size" or "memory.process.size". A value of 0.3
>>>> will lead to overly large heaps and very little RocksDB / batch
>>>> memory, even when scaling up during the initial exploration.
>>>> - I agree, though, that 0.5 looks too aggressive from your
>>>> benchmarks. So maybe keeping it at 0.4 could work?
>>>>
>>>> And one question: Why do we set the framework heap by default? Is
>>>> that so we reduce the managed memory further if less than the
>>>> framework heap would otherwise be left of the JVM heap?
>>>>
>>>> Best,
>>>> Stephan
>>>>
>>>> On Thu, Jan 9, 2020 at 10:54 AM Xintong Song <tonysong...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi all,
>>>> >
>>>> > As described in FLINK-15145 [1], we decided to tune the default
>>>> > configuration values of FLIP-49 with more jobs and cases.
>>>> >
>>>> > After spending some time analyzing and tuning the configurations,
>>>> > I've come up with several findings. To be brief, I would suggest
>>>> > the following changes; for more details, please take a look at my
>>>> > tuning report [2].
>>>> >
>>>> > - Change the default managed memory fraction from 0.4 to 0.3.
>>>> > - Change the default JVM metaspace size from 128MB to 64MB.
>>>> > - Change the default JVM overhead min size from 128MB to 196MB.
>>>> >
>>>> > Looking forward to your feedback.
>>>> >
>>>> > Thank you~
>>>> >
>>>> > Xintong Song
>>>> >
>>>> > [1] https://issues.apache.org/jira/browse/FLINK-15145
>>>> >
>>>> > [2]
>>>> > https://docs.google.com/document/d/1-LravhQYUIkXb7rh0XnBB78vSvhp3ecLSAgsiabfVkk/edit?usp=sharing
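To put rough numbers on the fraction question, a back-of-the-envelope
example (illustrative only: it assumes a hypothetical 1g process size, the
proposed 64m metaspace and 196m minimum JVM overhead, and it ignores
rounding and the network memory split):

  total Flink memory   = 1024m - 64m (metaspace) - 196m (overhead) = 764m
  managed memory @ 0.3 ~ 0.3 * 764m ~ 229m
  managed memory @ 0.4 ~ 0.4 * 764m ~ 306m

The remainder (minus network memory) largely goes to the JVM heap, which is
why a lower fraction gives larger heaps but noticeably less memory for
RocksDB and batch operators, per Stephan's concern above.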