Hi all,

Thank you for all your comments! The FLIP has been updated
accordingly. Please let me know if you have any further questions or
comments.

Also, note that many people are on Christmas break, so we will keep
the discussion open for another week.

Best,
Xuannan

On Wed, Dec 27, 2023 at 5:20 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > After some investigation, it turns out those options of input/output
> > format are only publicly exposed in the DataSet docs[2], which is
> > deprecated. Thus, marking them as deprecated and removed in Flink 2.0
> > looks fine to me.
>
> Thanks Xuannan for the detailed investigation, if so, deprecate them
> and removing them in Flink 2.0 looks good to me.
>
> > I think the key of LOCAL_NUMBER_TASK_MANAGER is better as
> > 'minicluster.number-of-taskmanagers' or 'minicluster.taskmanager-number'
> > instead of 'minicluster.number-taskmanager'.
>
> Thanks Hang for the good suggestion! 'minicluster.number-of-taskmanagers'
> sounds good to me, it's similar to taskmanager.numberOfTaskSlots.
>
> Best,
> Rui
>
> On Wed, Dec 27, 2023 at 1:56 PM Hang Ruan <ruanhang1...@gmail.com> wrote:
>>
>> Hi, Rui Fan.
>>
>> Thanks for this FLIP.
>>
>> I think the key of LOCAL_NUMBER_TASK_MANAGER is better as
>> 'minicluster.number-of-taskmanagers' or 'minicluster.taskmanager-number'
>> instead of 'minicluster.number-taskmanager'.
>>
>> Best,
>> Hang
>>
>> Xuannan Su <suxuanna...@gmail.com> 于2023年12月27日周三 12:40写道:
>>
>> > Hi Xintong and Rui,
>> >
>> > Thanks for the quick feedback and the suggestions.
>> >
>> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` should be
>> > "no
>> > > default".
>> >
>> > I have considered both ways of describing the default value. However,
>> > I found out that some of the configurations, such as `web.tmpdir`, put
>> > `System.getProperty()` in the default value [1]. Some are putting the
>> > description in the default value column[2]. So I just picked the first
>> > one. I am fine with either way, so long as they are consistent. WDYT?
>> >
>> > > 3. Simply saying "getting / setting value with string key is discouraged"
>> > > in JavaDoc of get/setString is IMHO a bit confusing. People may have the
>> > > question why would we keep the discouraged interfaces at all. I would
>> > > suggest the following:
>> > > ```
>> > > We encourage users and developers to always use ConfigOption for getting
>> > /
>> > > setting the configurations if possible, for its rich description, type,
>> > > default-value and other supports. The string-key-based getter / setter
>> > > should only be used when ConfigOption is not applicable, e.g., the key is
>> > > programmatically generated in runtime.
>> > > ```
>> >
>> > The suggested comment looks good to me. Thanks for the suggestion. I
>> > will update the comment in the FLIP.
>> >
>> > > 2. So I wonder if we can simply mark them as deprecated and remove in
>> > 2.0.
>> >
>> > After some investigation, it turns out those options of input/output
>> > format are only publicly exposed in the DataSet docs[2], which is
>> > deprecated. Thus, marking them as deprecated and removed in Flink 2.0
>> > looks fine to me.
>> >
>> >
>> > @Rui
>> >
>> > > Configuration has a `public <T> T get(ConfigOption<T> option)` method.
>> > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods?
>> >
>> > +1 Only keep the get(ConfigOption<T> option),
>> > getOptional(ConfigOption<T> option), and set(ConfigOption<T> option, T
>> > value).
>> >
>> > Best,
>> > Xuannan
>> >
>> > [1]
>> > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#web-tmpdir
>> > [2]
>> > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-container-image-ref
>> > [3]
>> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/dataset/overview/#data-sources
>> >
>> >
>> >
>> >
>> > On Tue, Dec 26, 2023 at 8:47 PM Xintong Song <tonysong...@gmail.com>
>> > wrote:
>> > >
>> > > >
>> > > > Configuration has a `public <T> T get(ConfigOption<T> option)` method.
>> > > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)`
>> > methods?
>> > >
>> > >
>> > >
>> > > Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)` methods
>> > > > can be replaced with `public <T> Configuration set(ConfigOption<T>
>> > option,
>> > > > T value)` as well.
>> > >
>> > >
>> > > +1
>> > >
>> > >
>> > > Best,
>> > >
>> > > Xintong
>> > >
>> > >
>> > >
>> > > On Tue, Dec 26, 2023 at 8:44 PM Xintong Song <tonysong...@gmail.com>
>> > wrote:
>> > >
>> > > > These features don't have a public option, but they work. I'm not sure
>> > > >> whether these features are used by some advanced users.
>> > > >> Actually, I think some of them are valuable! For example:
>> > > >>
>> > > >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
>> > > >>   allows users to define the start command of the yarn container.
>> > > >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
>> > > >>   flink job reads all files under the directory even if it has nested
>> > > >> directories.
>> > > >>
>> > > >> This FLIP focuses on the refactor option, I'm afraid these features
>> > are
>> > > >> used
>> > > >> in some production and removing these features will affect some flink
>> > > >> jobs.
>> > > >> So I prefer to keep these features, WDTY?
>> > > >>
>> > > >
>> > > > First of all, I don't think we should support any knobs that users can
>> > > > only learn how to use from reading Flink's internal codes. From this
>> > > > perspective, for existing string-keyed knobs that are not mentioned in
>> > any
>> > > > public documentation, yes we can argue that they are functioning, but
>> > we
>> > > > can also argue that they are not really exposed to users. That means
>> > > > migrating them to ConfigOption is not a pure refactor, but would make
>> > > > something that used to be hidden from users now exposed to users. For
>> > such
>> > > > options, I personally would lean toward not exposing them. If we
>> > consider
>> > > > them as already exposed, then process-wise there's no problem in
>> > > > deprecating some infrequently-used options and removing them in a major
>> > > > version bump, and if they are proved needed later we can add them back
>> > > > anytime. On the other hand, if we consider them as not yet exposed,
>> > then
>> > > > removing them later would be a breaking change.
>> > > >
>> > > >
>> > > > Secondly, I don't really come up with any cases where users need to
>> > tune
>> > > > these knobs. E.g., why would we allow users to customize the yarn
>> > container
>> > > > start command while we already provide `env.java.opts`? And what would
>> > be
>> > > > the problem if Flink just doesn't support nested files? And even worse,
>> > > > such knobs may provide chances for users to shoot themself in the foot.
>> > > > E.g., what if %jvmmem% is missing from a user-provided container start
>> > > > command? Admittedly, there might be a small fraction of advanced users
>> > that
>> > > > know how to use these knobs. However, those users usually have their
>> > own
>> > > > custom fork of Flink, and it should not be a big problem for them to
>> > build
>> > > > such abilities by themselves.
>> > > >
>> > > >
>> > > > Taking a step back, if we decide that some of the mentioned knobs are
>> > > > really useful, we should at least provide enough descriptions to help
>> > users
>> > > > understand when and how to use these options. E.g., the current
>> > description
>> > > > for yarn container command template is far from enough, which does not
>> > > > explain what the placeholders mean, what happens if some placeholders
>> > are
>> > > > missing or if an unknown placeholder is provided.
>> > > >
>> > > >
>> > > > Best,
>> > > >
>> > > > Xintong
>> > > >
>> > > >
>> > > >
>> > > > On Tue, Dec 26, 2023 at 7:39 PM Rui Fan <1996fan...@gmail.com> wrote:
>> > > >
>> > > >> In addition to what is written in FLIP, I found some methods of
>> > > >> Configuration
>> > > >> are not necessary. And I wanna hear more thoughts from all of you.
>> > > >>
>> > > >> Configuration has a `public <T> T get(ConfigOption<T> option)` method.
>> > > >> Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)`
>> > methods?
>> > > >> Such as:
>> > > >> - public int getInteger(ConfigOption<Integer> configOption)
>> > > >> - public String getString(ConfigOption<String> configOption)
>> > > >> - public long getLong(ConfigOption<Long> configOption)
>> > > >> - etc
>> > > >>
>> > > >> I prefer users and flink use `get(ConfigOption<T> option)` instead of
>> > > >> `getXxx(ConfigOption<Xxx> configOption)` based on some reasons:
>> > > >>
>> > > >> 1. All callers can replace it directly without any extra effort.
>> > > >>   `T get(ConfigOption<T> option)` method can replace all
>> > > >>   `Xxx getXxx(ConfigOption<Xxx> configOption)` methods directly.
>> > > >> 2. Callers can call get directly, and users or flink developers
>> > > >>   don't need to care about should they call getInteger or getString.
>> > > >> 3. Flink code is easier to maintain.
>> > > >> 4. `T get(ConfigOption<T> option)` is designed later than
>> > > >>     `Xxx getXxx(ConfigOption<Xxx> configOption)`, I guess if
>> > > >>     `T get(ConfigOption<T> option)` is designed first,
>> > > >>    all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods
>> > > >>   aren't needed.
>> > > >>
>> > > >> Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)`
>> > methods
>> > > >> can be replaced with `public <T> Configuration set(ConfigOption<T>
>> > option,
>> > > >> T value)` as well.
>> > > >>
>> > > >> If they can be marked as deprecated, only `public Xxx
>> > > >> getXxx(ConfigOption<Xxx> configOption, Xxx overrideDefault)`
>> > > >> is keeped after this FLIP. And the rest of getXxx or setXxx can be
>> > removed
>> > > >> in 2.0.
>> > > >>
>> > > >> Looking forward to everyone's feedback and suggestions, thank you!
>> > > >>
>> > > >> Best,
>> > > >> Rui
>> > > >>
>> > > >> On Tue, Dec 26, 2023 at 7:15 PM Rui Fan <1996fan...@gmail.com> wrote:
>> > > >>
>> > > >> > Thanks Xintong for the quick feedback and the good suggestions!
>> > > >> >
>> > > >> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY`
>> > should be
>> > > >> > "no
>> > > >> > > default". We can explain in the description that if not
>> > configured,
>> > > >> > > `System.getProperty("log.file")` will be used, but that is not a
>> > > >> default
>> > > >> > > value.
>> > > >> >
>> > > >> > Explain it in the description is fine for me.
>> > > >> >
>> > > >> > > 2. So I wonder if we can simply mark them as deprecated and
>> > remove in
>> > > >> > 2.0.
>> > > >> >
>> > > >> > These features don't have a public option, but they work. I'm not
>> > sure
>> > > >> > whether these features are used by some advanced users.
>> > > >> > Actually, I think some of them are valuable! For example:
>> > > >> >
>> > > >> > - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
>> > > >> >   allows users to define the start command of the yarn container.
>> > > >> > - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
>> > > >> >   flink job reads all files under the directory even if it has
>> > nested
>> > > >> > directories.
>> > > >> >
>> > > >> > This FLIP focuses on the refactor option, I'm afraid these features
>> > are
>> > > >> > used
>> > > >> > in some production and removing these features will affect some
>> > flink
>> > > >> jobs.
>> > > >> > So I prefer to keep these features, WDTY?
>> > > >> >
>> > > >> > > 3. Simply saying "getting / setting value with string key is
>> > > >> discouraged"
>> > > >> > > in JavaDoc of get/setString is IMHO a bit confusing. People may
>> > have
>> > > >> the
>> > > >> > > question why would we keep the discouraged interfaces at all. I
>> > would
>> > > >> > > suggest the following:
>> > > >> > > ```
>> > > >> > > We encourage users and developers to always use ConfigOption for
>> > > >> getting
>> > > >> > /
>> > > >> > > setting the configurations if possible, for its rich description,
>> > > >> type,
>> > > >> > > default-value and other supports. The string-key-based getter /
>> > setter
>> > > >> > > should only be used when ConfigOption is not applicable, e.g., the
>> > > >> key is
>> > > >> > > programmatically generated in runtime.
>> > > >> > > ```
>> > > >> >
>> > > >> > Suggested comment is good for me, and I'd like to hear the thought
>> > from
>> > > >> > Xuannan
>> > > >> > who wrote the original comment.
>> > > >> >
>> > > >> > Best,
>> > > >> > Rui
>> > > >> >
>> > > >> > On Tue, Dec 26, 2023 at 5:26 PM Xintong Song <tonysong...@gmail.com
>> > >
>> > > >> > wrote:
>> > > >> >
>> > > >> >> Thanks for the efforts, Rui and Xuannan.
>> > > >> >>
>> > > >> >> I think it's a good idea to migrate string-key configuration
>> > accesses
>> > > >> to
>> > > >> >> ConfigOption-s in general.
>> > > >> >>
>> > > >> >> I have a few suggestions / questions regarding the FLIP.
>> > > >> >>
>> > > >> >> 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY`
>> > should be
>> > > >> "no
>> > > >> >> default". We can explain in the description that if not configured,
>> > > >> >> `System.getProperty("log.file")` will be used, but that is not a
>> > > >> default
>> > > >> >> value.
>> > > >> >>
>> > > >> >> 2. I wonder if the following string-keys can be simply removed?
>> > They
>> > > >> are
>> > > >> >> neither set by Flink, nor documented anywhere (AFAIK) so that users
>> > > >> know
>> > > >> >> how to set them. All of them were introduced a long time ago,
>> > require
>> > > >> >> significant knowledge and familiarity about Flink internals and low
>> > > >> level
>> > > >> >> details in order to use, and some of them are even private. So I
>> > > >> wonder if
>> > > >> >> we can simply mark them as deprecated and remove in 2.0.
>> > > >> >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
>> > > >> >> - FileInputFormat.FILE_PARAMETER_KEY
>> > > >> >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG
>> > > >> >> - FileOutputFormat.FILE_PARAMETER_KEY
>> > > >> >> - BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY
>> > > >> >> - BinaryOutputFormat.BLOCK_SIZE_PARAMETER_KEY
>> > > >> >>
>> > > >> >> 3. Simply saying "getting / setting value with string key is
>> > > >> discouraged"
>> > > >> >> in JavaDoc of get/setString is IMHO a bit confusing. People may
>> > have
>> > > >> the
>> > > >> >> question why would we keep the discouraged interfaces at all. I
>> > would
>> > > >> >> suggest the following:
>> > > >> >> ```
>> > > >> >> We encourage users and developers to always use ConfigOption for
>> > > >> getting /
>> > > >> >> setting the configurations if possible, for its rich description,
>> > type,
>> > > >> >> default-value and other supports. The string-key-based getter /
>> > setter
>> > > >> >> should only be used when ConfigOption is not applicable, e.g., the
>> > key
>> > > >> is
>> > > >> >> programmatically generated in runtime.
>> > > >> >> ```
>> > > >> >>
>> > > >> >> WDYT?
>> > > >> >>
>> > > >> >> Best,
>> > > >> >>
>> > > >> >> Xintong
>> > > >> >>
>> > > >> >>
>> > > >> >>
>> > > >> >> On Tue, Dec 26, 2023 at 4:12 PM Rui Fan <1996fan...@gmail.com>
>> > wrote:
>> > > >> >>
>> > > >> >> > Hi all,
>> > > >> >> >
>> > > >> >> > Xuannan(cced) and I would like to start a discussion on FLIP-405:
>> > > >> >> > FLIP-405: Migrate string configuration key to ConfigOption[1].
>> > > >> >> >
>> > > >> >> > As Flink progresses to 2.0, we want to enhance the user
>> > experience
>> > > >> >> > with the existing configuration. In FLIP-77, Flink introduced
>> > > >> >> ConfigOption
>> > > >> >> > with DataType and strongly encourage users to utilize
>> > ConfigOption
>> > > >> >> > instead of string keys for accessing and setting Flink
>> > > >> configurations.
>> > > >> >> > Presently, many string configuration keys have been deprecated
>> > and
>> > > >> >> > replaced with ConfigOptions; however, some string configuration
>> > > >> >> > keys are still in use.
>> > > >> >> >
>> > > >> >> > To ensure a better experience with the existing configuration in
>> > > >> Flink
>> > > >> >> > 2.0,
>> > > >> >> > this FLIP will migrate all user-facing string configuration keys
>> > to
>> > > >> >> > ConfigOptions.
>> > > >> >> > Additionally, we want to modify the Configuration infrastructure
>> > to
>> > > >> >> > promote the use of ConfigOption over string configuration keys
>> > > >> >> > among developers and users. It's mentioned in a preview
>> > thread[2].
>> > > >> >> >
>> > > >> >> > Looking forward to everyone's feedback and suggestions, thank
>> > you!
>> > > >> >> >
>> > > >> >> > [1] https://cwiki.apache.org/confluence/x/6Yr5E
>> > > >> >> > [2]
>> > https://lists.apache.org/thread/zzsf7glfcdjcjm1hfo1xdwc6jp37nb3m
>> > > >> >> >
>> > > >> >> > Best,
>> > > >> >> > Rui
>> > > >> >> >
>> > > >> >>
>> > > >> >
>> > > >>
>> > > >
>> >

Reply via email to