@Becket One thing that may be non-obvious is that the Configuration class also defines serialization / persistence logic at the moment. So it needs to know the set of types it supports. That stands in the way of an arbitrary generic map type.
@Timo I agree though that it seems a bit inconsistent to have one collection orthogonal to the type (List) and another one bound to specific types (Map). On Thu, Aug 29, 2019 at 8:20 AM Becket Qin <becket....@gmail.com> wrote: > Hi Timo, > > Thanks for the proposal. Sorry for the late comments, but I have a few > questions / comments. > > 1. Is a new field of isList necessary in the ConfigOption? > Would it be enough to just check the atomicClass to see if it is a List or > not? > Also, in the ConfigOption<Map> class case, are we always assume both key > and value types are String? Can we just apply the same to the > ConfigOption<List>? > BTW, I did a quick search in the codebase but did not find any usage of > ConfigOption<List>. > > 2. The same config name, but with two ConfigOption with different semantic > in different component seems super confusing. For example, when users set > both configs, they may have no idea one is overriding the other. There > might be two cases: > - If it is just the same config used by different components to act > accordingly, it might be better to just have one config, but describe > clearly on how that config will be used. > - If it is actually two configurations that can be set differently, I > think the config names should just be different. > > 3. Regarding the ConfigurableFactory, is the toConfiguration() method > pretty much means getConfiguration()? The toConfiguration() method sounds > like converting an object to a configuration, which only works if the > object does not contain any state / value. I am also wondering if there is > a real use case of this method. Because supposedly the configurations could > just be passed around to caller of this method. > > Also, can you put the proposal into the FLIP wiki instead of in the Google > doc before voting? The FLIP wiki allows track the modification history and > has a more established structure to ensure nothing is missed. > > Thanks, > > Jiangjie (Becket) Qin > > On Tue, Aug 27, 2019 at 11:34 PM Timo Walther <twal...@apache.org> wrote: > > > Hi everyone, > > > > I updated the FLIP proposal one more time as mentioned in the voting > > thread. If there are no objections, I will start a new voting thread > > tomorrow at 9am Berlin time. > > > > Thanks, > > Timo > > > > > > On 22.08.19 14:19, Timo Walther wrote: > > > Hi everyone, > > > > > > thanks for all the feedback we have received online and offline. It > > > showed that many people support the idea of evolving the Flink > > > configuration functionality. I'm almost sure that this FLIP will not > > > solve all issues but at least will improve the current status. > > > > > > We've updated the document and replaced the Correlation part with the > > > concept of a ConfigOptionGroup that can provide all available options > > > of a group plus custom group validators for eager validation. For now, > > > this eager group validation will only be used at certain locations in > > > the Flink code but it prepares for maybe validating the entire global > > > configuration before submitting a job in the future. > > > > > > Please take another look if you find time. I hope we can proceed with > > > the voting process if there are no objections. > > > > > > Regards, > > > Timo > > > > > > Am 19.08.19 um 12:54 schrieb Timo Walther: > > >> Hi Stephan, > > >> > > >> thanks for your suggestions. Let me give you some background about > > >> the decisions made in this FLIP: > > >> > > >> 1. Goal: The FLIP is labelled "evolve" not "rework" because we did > > >> not want to change the entire configuration infrastructure. Both for > > >> backwards-compatibility reasons and the amount of work that would be > > >> required to update all options. If our goal is to rework the > > >> configuration option entirely, I might suggest to switch to JSON > > >> format with JSON schema and JSON validator. However, setting > > >> properties in a CLI or web interface becomes more tricky the more > > >> nested structures are allowed. > > >> > > >> 2. Class-based Options: The current ConfigOption<T> class is centered > > >> around Java classes where T is determined by the default value. The > > >> FLIP just makes this more explicit by offering an explicit > > >> `intType()` method etc. The current design of validators centered > > >> around Java classes makes it possible to have typical domain > > >> validators baked by generics as you suggested. If we introduce types > > >> such as "quantity with measure and unit" we still need to get a class > > >> out of this option at the end, so why changing a proven concept? > > >> > > >> 3. List Options: The `isList` prevents having arbitrary nesting. As > > >> Dawid mentioned, we kept human readability in mind. For every atomic > > >> option like "key=12" can be represented by a list "keys=12;13". But > > >> we don't want to go further; esp. no nesting. A dedicated list option > > >> would start making this more complicated such as > > >> "ListOption(ObjectOption(ListOption(IntOption, ...), > > >> StringOption(...)))", do we want that? > > >> > > >> 4. Correlation: The correlation part is one of the suggestions that I > > >> like least in the document. We can also discuss removing it entirely, > > >> but I think it solves the use case of relating options with each > > >> other in a flexible way right next to the actual option. Instead of > > >> being hidden in some component initialization, we should put it close > > >> to the option to also perform validation eagerly instead of failing > > >> at runtime when the option is accessed the first time. > > >> > > >> Regards, > > >> Timo > > >> > > >> > > >> Am 18.08.19 um 23:32 schrieb Stephan Ewen: > > >>> A "List Type" sounds like a good direction to me. > > >>> > > >>> The comment on the type system was a bit brief, I agree. The idea is > > >>> to see > > >>> if something like that can ease validation. Especially the > correlation > > >>> system seems quite complex (proxies to work around order of > > >>> initialization). > > >>> > > >>> For example, let's assume we don't think primarily about "java > > >>> types" but > > >>> would define types as one of the following (just examples, haven't > > >>> thought > > >>> all the details through): > > >>> > > >>> (a) category type: implies string, and a fix set of possible > values. > > >>> Those would be passes and naturally make it into the docs and > > >>> validation. > > >>> Maps to a String or Enum in Java. > > >>> > > >>> (b) numeric integer type: implies long (or optionally integer, if > > >>> we want > > >>> to automatically check overflow / underflow). would take typical > domain > > >>> validators, like non-negative, etc. > > >>> > > >>> (c) numeric real type: same as above (double or float) > > >>> > > >>> (d) numeric interval type: either defined as an interval, or > > >>> references > > >>> other parameter by key. validation by valid interval. > > >>> > > >>> (e) quantity: a measure and a unit. separately parsable. The > > >>> measure's > > >>> type could be any of the numeric types above, with same validation > > >>> rules. > > >>> > > >>> With a system like the above, would we still correlation validators? > > >>> Are > > >>> there still cases that we need to catch early (config loading) or > > >>> are the > > >>> remaining cases sufficiently rare and runtime or setup specific, > > >>> that it is > > >>> fine to handle them in component initialization? > > >>> > > >>> > > >>> On Sun, Aug 18, 2019 at 6:36 PM Dawid Wysakowicz > > >>> <dwysakow...@apache.org> > > >>> wrote: > > >>> > > >>>> Hi Stephan, > > >>>> > > >>>> Thank you for your opinion. > > >>>> > > >>>> Actually list/composite types are the topics we spent the most of > the > > >>>> time. I understand that from a perspective of a full blown type > > >>>> system, > > >>>> a field like isList may look weird. Please let me elaborate a bit > more > > >>>> on the reason behind it though. Maybe we weren't clear enough about > it > > >>>> in the FLIP. The key feature of all the conifg options is that they > > >>>> must > > >>>> have a string representation as they might come from a configuration > > >>>> file. Moreover it must be a human readable format, so that the > values > > >>>> might be manually adjusted. Having that in mind we did not want to > > >>>> add a > > >>>> support of an arbitrary nesting and we decided to allow for lists > only > > >>>> (and flat objects - I think though in the current design there is a > > >>>> mistake around the Configurable interface). I think though you have > a > > >>>> point here and it would be better to have a ListConfigOption > > >>>> instead of > > >>>> this field. Does it make sense to you? > > >>>> > > >>>> As for the second part of your message. I am not sure if I > understood > > >>>> it. The validators work with parse/deserialized values from > > >>>> Configuration that means they can be bound to the generic parameter > of > > >>>> Configuration. You can have a RangeValidator<? extends > > >>>> Comparable/Number>. I don't think the type hierarchy in the > > >>>> ConfigOption > > >>>> has anything to do with the validation logic. Could you elaborate a > > >>>> bit > > >>>> more what did you mean? > > >>>> > > >>>> Best, > > >>>> > > >>>> Dawid > > >>>> > > >>>> On 18/08/2019 16:42, Stephan Ewen wrote: > > >>>>> I like the idea of enhancing the configuration and to do early > > >>>> validation. > > >>>>> I feel that some of the ideas in the FLIP seem a bit ad hoc, > > >>>>> though. For > > >>>>> example, having a boolean "isList" is a clear indication of not > > >>>>> having > > >>>>> thought through the type/category system. > > >>>>> Also, having a more clear category system makes validation simpler. > > >>>>> > > >>>>> For example, I have seen systems distinguishing between numeric > > >>>> parameters > > >>>>> (valid ranges), category parameters (set of possible values), > > >>>>> quantities > > >>>>> like duration and memory size (need measure and unit), which > > >>>>> results in > > >>>> an > > >>>>> elegant system for validation. > > >>>>> > > >>>>> > > >>>>> On Fri, Aug 16, 2019 at 5:22 PM JingsongLee < > lzljs3620...@aliyun.com > > >>>> .invalid> > > >>>>> wrote: > > >>>>> > > >>>>>> +1 to this, thanks Timo and Dawid for the design. > > >>>>>> This allows the currently cluttered configuration of various > > >>>>>> modules to be unified. > > >>>>>> This is also first step of one of the keys to making new unified > > >>>>>> TableEnvironment available for production. > > >>>>>> > > >>>>>> Previously, we did encounter complex configurations, such as > > >>>>>> specifying the skewed values of column in DDL. The skew may > > >>>>>> be a single field or a combination of multiple fields. So the > > >>>>>> configuration is very troublesome. We used JSON string to > > >>>>>> configure it. > > >>>>>> > > >>>>>> Best, > > >>>>>> Jingsong Lee > > >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> ------------------------------------------------------------------ > > >>>>>> From:Jark Wu <imj...@gmail.com> > > >>>>>> Send Time:2019年8月16日(星期五) 16:44 > > >>>>>> To:dev <dev@flink.apache.org> > > >>>>>> Subject:Re: [DISCUSS] FLIP-54: Evolve ConfigOption and > Configuration > > >>>>>> > > >>>>>> Thanks for starting this design Timo and Dawid, > > >>>>>> > > >>>>>> Improving ConfigOption has been hovering in my mind for a long > time. > > >>>>>> We have seen the benefit when developing blink configurations and > > >>>> connector > > >>>>>> properties in 1.9 release. > > >>>>>> Thanks for bringing it up and make such a detailed design. > > >>>>>> I will leave my thoughts and comments there. > > >>>>>> > > >>>>>> Cheers, > > >>>>>> Jark > > >>>>>> > > >>>>>> > > >>>>>> On Fri, 16 Aug 2019 at 22:30, Zili Chen <wander4...@gmail.com> > > >>>>>> wrote: > > >>>>>> > > >>>>>>> Hi Timo, > > >>>>>>> > > >>>>>>> It looks interesting. Thanks for preparing this FLIP! > > >>>>>>> > > >>>>>>> Client API enhancement benefit from this evolution which > > >>>>>>> hopefully provides a better view of configuration of Flink. > > >>>>>>> In client API enhancement, we likely make the deployment > > >>>>>>> of cluster and submission of job totally defined by > configuration. > > >>>>>>> > > >>>>>>> Will take a look at the document in days. > > >>>>>>> > > >>>>>>> Best, > > >>>>>>> tison. > > >>>>>>> > > >>>>>>> > > >>>>>>> Timo Walther <twal...@apache.org> 于2019年8月16日周五 > > >>>>>>> 下午10:12写道: > > >>>>>>> > > >>>>>>>> Hi everyone, > > >>>>>>>> > > >>>>>>>> Dawid and I are working on making parts of ExecutionConfig and > > >>>>>>>> TableConfig configurable via config options. This is necessary > > >>>>>>>> to make > > >>>>>>>> all properties also available in SQL. Additionally, with the > > >>>>>>>> new SQL > > >>>>>> DDL > > >>>>>>>> based on properties as well as more connectors and formats > > >>>>>>>> coming up, > > >>>>>>>> unified configuration becomes more important. > > >>>>>>>> > > >>>>>>>> We need more features around string-based configuration in the > > >>>>>>>> future, > > >>>>>>>> which is why Dawid and I would like to propose FLIP-54 for > > >>>>>>>> evolving > > >>>> the > > >>>>>>>> ConfigOption and Configuration classes: > > >>>>>>>> > > >>>>>>>> > > >>>>>>>> > > >>>> > > > https://docs.google.com/document/d/1IQ7nwXqmhCy900t2vQLEL3N2HIdMg-JO8vTzo1BtyKU/edit > > >>>> > > >>>>>>>> In summary it adds: > > >>>>>>>> - documented types and validation > > >>>>>>>> - more common types such as memory size, duration, list > > >>>>>>>> - simple non-nested object types > > >>>>>>>> > > >>>>>>>> Looking forward to your feedback, > > >>>>>>>> Timo > > >>>>>>>> > > >>>>>>>> > > >>>> > > > > >