I like the idea of having a larger default "flink.size" in the default flink-conf.yaml. Maybe we don't need to double it, but something like 1280m would be okay?
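To make the numbers easier to compare, here is roughly what I have in mind for the default block in flink-conf.yaml. This is only a sketch: I am writing the FLIP-49 option names from memory (please correct me if they are off), and whether the fraction stays at 0.4 or goes down to 0.3 is still the open question in this thread.

  # sketch of possible defaults, numbers taken from this thread
  taskmanager.memory.flink.size: 1280m
  # 0.3 vs 0.4 is still under discussion
  taskmanager.memory.managed.fraction: 0.4
  taskmanager.memory.jvm-metaspace.size: 64m
  taskmanager.memory.jvm-overhead.min: 196m

With 1280m and a fraction of 0.4, managed memory would be 0.4 * 1280m = 512m, and the remainder goes to JVM heap, network and the other off-heap pools, so users who only touch flink.size during their first try-out should still end up with a reasonable split.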
On Tue, Jan 14, 2020 at 3:47 PM Andrey Zagrebin <azagre...@apache.org> wrote:

> Hi all!
>
> Great that we have already tried out the new FLIP-49 with the bigger jobs.
>
> I am also +1 for the JVM metaspace and overhead changes.
>
> Regarding 0.3 vs 0.4 for managed memory, +1 for having more managed memory
> for the RocksDB limiting case.
>
> In general, this looks mostly to be about memory distribution between JVM
> heap and managed off-heap.
> Compared to the previous default setup, the JVM heap dropped (especially
> for standalone), mostly due to moving managed memory from heap to off-heap
> and then also adding framework off-heap.
> This can be the most important consequence for beginners and those who
> rely on the default configuration, especially the legacy default
> configuration in standalone which falls back from heap.size to flink.size,
> but there it seems we cannot do too much now.
>
> I prepared a spreadsheet
> <https://docs.google.com/spreadsheets/d/1mJaMkMPfDJJ-w6nMXALYmTc4XxiV30P5U7DzgwLkSoE>
> to play with numbers for the setups mentioned in the report.
>
> One idea would be to set the process size (or a smaller flink size
> respectively) to a bigger default number, like 2048.
> In this case, the derived absolute default JVM heap and managed memory are
> close to the previous defaults, especially for managed fraction 0.3.
> This should align the defaults with the previous standalone try-out
> experience, where the increased off-heap memory is not strictly controlled
> by the environment anyway.
> The consequence for container users who relied on and updated the default
> configuration is that the containers will be requested with double the
> size.
>
> Best,
> Andrey
>
>
> On Tue, Jan 14, 2020 at 11:20 AM Till Rohrmann <trohrm...@apache.org>
> wrote:
>
> > +1 for the JVM metaspace and overhead changes.
> >
> > On Tue, Jan 14, 2020 at 11:19 AM Till Rohrmann <trohrm...@apache.org>
> > wrote:
> >
> >> I guess one of the most important results of this experiment is to have
> >> a good tuning guide available for users who are past the initial
> >> try-out phase, because the default settings will be kind of a
> >> compromise. I assume that this is part of the outstanding FLIP-49
> >> documentation task.
> >>
> >> If we limit RocksDB's memory consumption by default, then I believe
> >> that 0.4 would give the better all-round experience as it leaves a bit
> >> more memory for RocksDB. However, I'm a bit sceptical whether we should
> >> optimize the default settings for a configuration where the user still
> >> needs to activate the strict memory limiting for RocksDB. In this case,
> >> I would expect that the user could also adapt the managed memory
> >> fraction.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Tue, Jan 14, 2020 at 3:39 AM Xintong Song <tonysong...@gmail.com>
> >> wrote:
> >>
> >>> Thanks for the feedback, Stephan and Kurt.
> >>>
> >>> @Stephan
> >>>
> >>> Regarding the managed memory fraction:
> >>> - It makes sense to keep the default value 0.4 if we assume RocksDB
> >>> memory is limited by default.
> >>> - AFAIK, RocksDB currently does not limit its memory usage by default,
> >>> and I'm positive about changing that.
> >>> - Personally, I don't like the idea that the out-of-box experience
> >>> (for which we set the default fraction) relies on users manually
> >>> turning another switch on.
> >>>
> >>> Regarding framework heap memory:
> >>> - The major reason we set it by default is, as you mentioned, to have
> >>> a safety net of minimal JVM heap size.
> >>> - Also, considering the in-progress FLIP-56 (dynamic slot allocation),
> >>> we want to reserve some heap memory that will not go into the slot
> >>> profiles. That's why we decided the default value according to the
> >>> heap memory usage of an empty task executor.
> >>>
> >>> @Kurt
> >>> Regarding metaspace:
> >>> - This config option ("taskmanager.memory.jvm-metaspace") only takes
> >>> effect on TMs. Currently we do not set the metaspace size for the JM.
> >>> - If we have the same metaspace problem on TMs, then yes, changing it
> >>> from 128M to 64M will make it worse. However, IMO a 10T TPC-DS
> >>> benchmark should not be considered the out-of-box experience, and it
> >>> makes sense to tune the configurations for it. I think the smaller
> >>> metaspace size would be a better choice for the first try-out, where a
> >>> job should not be too complicated and the TM size could be relatively
> >>> small (e.g. 1g).
> >>>
> >>> Thank you~
> >>>
> >>> Xintong Song
> >>>
> >>>
> >>> On Tue, Jan 14, 2020 at 9:38 AM Kurt Young <ykt...@gmail.com> wrote:
> >>>
> >>>> Hi Xintong,
> >>>>
> >>>> IIRC, during our TPC-DS 10T benchmark we suffered from the JM's
> >>>> metaspace size and full GC, caused by lots of class loading for
> >>>> source input splits. Could you check whether changing the default
> >>>> value from 128MB to 64MB will make it worse?
> >>>>
> >>>> Correct me if I misunderstood anything, also cc @Jingsong
> >>>>
> >>>> Best,
> >>>> Kurt
> >>>>
> >>>>
> >>>> On Tue, Jan 14, 2020 at 3:44 AM Stephan Ewen <se...@apache.org> wrote:
> >>>>
> >>>>> Hi all!
> >>>>>
> >>>>> Thanks a lot, Xintong, for this thorough analysis. Based on your
> >>>>> analysis, here are some thoughts:
> >>>>>
> >>>>> +1 to change default JVM metaspace size from 128MB to 64MB
> >>>>> +1 to change default JVM overhead min size from 128MB to 196MB
> >>>>>
> >>>>> Concerning the managed memory fraction, I am not sure I would change
> >>>>> it, for the following reasons:
> >>>>>
> >>>>>   - We should assume RocksDB will be limited to managed memory by
> >>>>>   default. This will either be active by default or we would
> >>>>>   encourage everyone to use it by default, because otherwise it is
> >>>>>   super hard to reason about the RocksDB footprint.
> >>>>>   - For standalone, a managed memory fraction of 0.3 is less than
> >>>>>   half of the managed memory from 1.9.
> >>>>>   - I am not sure if the managed memory fraction is a value that all
> >>>>>   users adjust immediately when scaling up the memory during their
> >>>>>   first try-out phase. I would assume that most users initially only
> >>>>>   adjust "memory.flink.size" or "memory.process.size". A value of
> >>>>>   0.3 will lead to having too large heaps and very little RocksDB /
> >>>>>   batch memory even when scaling up during the initial exploration.
> >>>>>   - I agree, though, that 0.5 looks too aggressive from your
> >>>>>   benchmarks. So maybe keeping it at 0.4 could work?
> >>>>>
> >>>>> And one question: why do we set the framework heap by default? Is
> >>>>> that so we reduce the managed memory further if less than the
> >>>>> framework heap would be left from the JVM heap?
> >>>>>
> >>>>> Best,
> >>>>> Stephan
> >>>>>
> >>>>> On Thu, Jan 9, 2020 at 10:54 AM Xintong Song <tonysong...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Hi all,
> >>>>> >
> >>>>> > As described in FLINK-15145 [1], we decided to tune the default
> >>>>> > configuration values of FLIP-49 with more jobs and cases.
> >>>>> >
> >>>>> > After spending time analyzing and tuning the configurations, I've
> >>>>> > come up with several findings. To be brief, I would suggest the
> >>>>> > following changes; for more details please take a look at my
> >>>>> > tuning report [2].
> >>>>> >
> >>>>> > - Change default managed memory fraction from 0.4 to 0.3.
> >>>>> > - Change default JVM metaspace size from 128MB to 64MB.
> >>>>> > - Change default JVM overhead min size from 128MB to 196MB.
> >>>>> >
> >>>>> > Looking forward to your feedback.
> >>>>> >
> >>>>> > Thank you~
> >>>>> >
> >>>>> > Xintong Song
> >>>>> >
> >>>>> > [1] https://issues.apache.org/jira/browse/FLINK-15145
> >>>>> >
> >>>>> > [2]
> >>>>> > https://docs.google.com/document/d/1-LravhQYUIkXb7rh0XnBB78vSvhp3ecLSAgsiabfVkk/edit?usp=sharing