Hi all,

Thanks for the discussion. I think all the comments and questions have
been addressed. I will open the voting thread today.

Best,
Xuannan


On Tue, Jan 2, 2024 at 11:59 AM Xuannan Su <suxuanna...@gmail.com> wrote:
>
> Hi all,
>
> Thank you for all your comments! The FLIP has been updated
> accordingly. Please let me know if you have any further questions or
> comments.
>
> Also, note that many people are on Christmas break, so we will keep
> the discussion open for another week.
>
> Best,
> Xuannan
>
> On Wed, Dec 27, 2023 at 5:20 PM Rui Fan <1996fan...@gmail.com> wrote:
> >
> > > After some investigation, it turns out those options of input/output
> > > format are only publicly exposed in the DataSet docs[2], which is
> > > deprecated. Thus, marking them as deprecated and removed in Flink 2.0
> > > looks fine to me.
> >
> > Thanks Xuannan for the detailed investigation, if so, deprecate them
> > and removing them in Flink 2.0 looks good to me.
> >
> > > I think the key of LOCAL_NUMBER_TASK_MANAGER is better as
> > > 'minicluster.number-of-taskmanagers' or 'minicluster.taskmanager-number'
> > > instead of 'minicluster.number-taskmanager'.
> >
> > Thanks Hang for the good suggestion! 'minicluster.number-of-taskmanagers'
> > sounds good to me, it's similar to taskmanager.numberOfTaskSlots.
> >
> > Best,
> > Rui
> >
> > On Wed, Dec 27, 2023 at 1:56 PM Hang Ruan <ruanhang1...@gmail.com> wrote:
> >>
> >> Hi, Rui Fan.
> >>
> >> Thanks for this FLIP.
> >>
> >> I think the key of LOCAL_NUMBER_TASK_MANAGER is better as
> >> 'minicluster.number-of-taskmanagers' or 'minicluster.taskmanager-number'
> >> instead of 'minicluster.number-taskmanager'.
> >>
> >> Best,
> >> Hang
> >>
> >> Xuannan Su <suxuanna...@gmail.com> 于2023年12月27日周三 12:40写道:
> >>
> >> > Hi Xintong and Rui,
> >> >
> >> > Thanks for the quick feedback and the suggestions.
> >> >
> >> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY` should be
> >> > "no
> >> > > default".
> >> >
> >> > I have considered both ways of describing the default value. However,
> >> > I found out that some of the configurations, such as `web.tmpdir`, put
> >> > `System.getProperty()` in the default value [1]. Some are putting the
> >> > description in the default value column[2]. So I just picked the first
> >> > one. I am fine with either way, so long as they are consistent. WDYT?
> >> >
> >> > > 3. Simply saying "getting / setting value with string key is 
> >> > > discouraged"
> >> > > in JavaDoc of get/setString is IMHO a bit confusing. People may have 
> >> > > the
> >> > > question why would we keep the discouraged interfaces at all. I would
> >> > > suggest the following:
> >> > > ```
> >> > > We encourage users and developers to always use ConfigOption for 
> >> > > getting
> >> > /
> >> > > setting the configurations if possible, for its rich description, type,
> >> > > default-value and other supports. The string-key-based getter / setter
> >> > > should only be used when ConfigOption is not applicable, e.g., the key 
> >> > > is
> >> > > programmatically generated in runtime.
> >> > > ```
> >> >
> >> > The suggested comment looks good to me. Thanks for the suggestion. I
> >> > will update the comment in the FLIP.
> >> >
> >> > > 2. So I wonder if we can simply mark them as deprecated and remove in
> >> > 2.0.
> >> >
> >> > After some investigation, it turns out those options of input/output
> >> > format are only publicly exposed in the DataSet docs[2], which is
> >> > deprecated. Thus, marking them as deprecated and removed in Flink 2.0
> >> > looks fine to me.
> >> >
> >> >
> >> > @Rui
> >> >
> >> > > Configuration has a `public <T> T get(ConfigOption<T> option)` method.
> >> > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)` 
> >> > > methods?
> >> >
> >> > +1 Only keep the get(ConfigOption<T> option),
> >> > getOptional(ConfigOption<T> option), and set(ConfigOption<T> option, T
> >> > value).
> >> >
> >> > Best,
> >> > Xuannan
> >> >
> >> > [1]
> >> > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#web-tmpdir
> >> > [2]
> >> > https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/config/#kubernetes-container-image-ref
> >> > [3]
> >> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/dataset/overview/#data-sources
> >> >
> >> >
> >> >
> >> >
> >> > On Tue, Dec 26, 2023 at 8:47 PM Xintong Song <tonysong...@gmail.com>
> >> > wrote:
> >> > >
> >> > > >
> >> > > > Configuration has a `public <T> T get(ConfigOption<T> option)` 
> >> > > > method.
> >> > > > Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)`
> >> > methods?
> >> > >
> >> > >
> >> > >
> >> > > Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)` 
> >> > > methods
> >> > > > can be replaced with `public <T> Configuration set(ConfigOption<T>
> >> > option,
> >> > > > T value)` as well.
> >> > >
> >> > >
> >> > > +1
> >> > >
> >> > >
> >> > > Best,
> >> > >
> >> > > Xintong
> >> > >
> >> > >
> >> > >
> >> > > On Tue, Dec 26, 2023 at 8:44 PM Xintong Song <tonysong...@gmail.com>
> >> > wrote:
> >> > >
> >> > > > These features don't have a public option, but they work. I'm not 
> >> > > > sure
> >> > > >> whether these features are used by some advanced users.
> >> > > >> Actually, I think some of them are valuable! For example:
> >> > > >>
> >> > > >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
> >> > > >>   allows users to define the start command of the yarn container.
> >> > > >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
> >> > > >>   flink job reads all files under the directory even if it has 
> >> > > >> nested
> >> > > >> directories.
> >> > > >>
> >> > > >> This FLIP focuses on the refactor option, I'm afraid these features
> >> > are
> >> > > >> used
> >> > > >> in some production and removing these features will affect some 
> >> > > >> flink
> >> > > >> jobs.
> >> > > >> So I prefer to keep these features, WDTY?
> >> > > >>
> >> > > >
> >> > > > First of all, I don't think we should support any knobs that users 
> >> > > > can
> >> > > > only learn how to use from reading Flink's internal codes. From this
> >> > > > perspective, for existing string-keyed knobs that are not mentioned 
> >> > > > in
> >> > any
> >> > > > public documentation, yes we can argue that they are functioning, but
> >> > we
> >> > > > can also argue that they are not really exposed to users. That means
> >> > > > migrating them to ConfigOption is not a pure refactor, but would make
> >> > > > something that used to be hidden from users now exposed to users. For
> >> > such
> >> > > > options, I personally would lean toward not exposing them. If we
> >> > consider
> >> > > > them as already exposed, then process-wise there's no problem in
> >> > > > deprecating some infrequently-used options and removing them in a 
> >> > > > major
> >> > > > version bump, and if they are proved needed later we can add them 
> >> > > > back
> >> > > > anytime. On the other hand, if we consider them as not yet exposed,
> >> > then
> >> > > > removing them later would be a breaking change.
> >> > > >
> >> > > >
> >> > > > Secondly, I don't really come up with any cases where users need to
> >> > tune
> >> > > > these knobs. E.g., why would we allow users to customize the yarn
> >> > container
> >> > > > start command while we already provide `env.java.opts`? And what 
> >> > > > would
> >> > be
> >> > > > the problem if Flink just doesn't support nested files? And even 
> >> > > > worse,
> >> > > > such knobs may provide chances for users to shoot themself in the 
> >> > > > foot.
> >> > > > E.g., what if %jvmmem% is missing from a user-provided container 
> >> > > > start
> >> > > > command? Admittedly, there might be a small fraction of advanced 
> >> > > > users
> >> > that
> >> > > > know how to use these knobs. However, those users usually have their
> >> > own
> >> > > > custom fork of Flink, and it should not be a big problem for them to
> >> > build
> >> > > > such abilities by themselves.
> >> > > >
> >> > > >
> >> > > > Taking a step back, if we decide that some of the mentioned knobs are
> >> > > > really useful, we should at least provide enough descriptions to help
> >> > users
> >> > > > understand when and how to use these options. E.g., the current
> >> > description
> >> > > > for yarn container command template is far from enough, which does 
> >> > > > not
> >> > > > explain what the placeholders mean, what happens if some placeholders
> >> > are
> >> > > > missing or if an unknown placeholder is provided.
> >> > > >
> >> > > >
> >> > > > Best,
> >> > > >
> >> > > > Xintong
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, Dec 26, 2023 at 7:39 PM Rui Fan <1996fan...@gmail.com> wrote:
> >> > > >
> >> > > >> In addition to what is written in FLIP, I found some methods of
> >> > > >> Configuration
> >> > > >> are not necessary. And I wanna hear more thoughts from all of you.
> >> > > >>
> >> > > >> Configuration has a `public <T> T get(ConfigOption<T> option)` 
> >> > > >> method.
> >> > > >> Could we remove all `Xxx getXxx(ConfigOption<Xxx> configOption)`
> >> > methods?
> >> > > >> Such as:
> >> > > >> - public int getInteger(ConfigOption<Integer> configOption)
> >> > > >> - public String getString(ConfigOption<String> configOption)
> >> > > >> - public long getLong(ConfigOption<Long> configOption)
> >> > > >> - etc
> >> > > >>
> >> > > >> I prefer users and flink use `get(ConfigOption<T> option)` instead 
> >> > > >> of
> >> > > >> `getXxx(ConfigOption<Xxx> configOption)` based on some reasons:
> >> > > >>
> >> > > >> 1. All callers can replace it directly without any extra effort.
> >> > > >>   `T get(ConfigOption<T> option)` method can replace all
> >> > > >>   `Xxx getXxx(ConfigOption<Xxx> configOption)` methods directly.
> >> > > >> 2. Callers can call get directly, and users or flink developers
> >> > > >>   don't need to care about should they call getInteger or getString.
> >> > > >> 3. Flink code is easier to maintain.
> >> > > >> 4. `T get(ConfigOption<T> option)` is designed later than
> >> > > >>     `Xxx getXxx(ConfigOption<Xxx> configOption)`, I guess if
> >> > > >>     `T get(ConfigOption<T> option)` is designed first,
> >> > > >>    all `Xxx getXxx(ConfigOption<Xxx> configOption)` methods
> >> > > >>   aren't needed.
> >> > > >>
> >> > > >> Note: all `public void setXxx(ConfigOption<Xxx> key, Xxx value)`
> >> > methods
> >> > > >> can be replaced with `public <T> Configuration set(ConfigOption<T>
> >> > option,
> >> > > >> T value)` as well.
> >> > > >>
> >> > > >> If they can be marked as deprecated, only `public Xxx
> >> > > >> getXxx(ConfigOption<Xxx> configOption, Xxx overrideDefault)`
> >> > > >> is keeped after this FLIP. And the rest of getXxx or setXxx can be
> >> > removed
> >> > > >> in 2.0.
> >> > > >>
> >> > > >> Looking forward to everyone's feedback and suggestions, thank you!
> >> > > >>
> >> > > >> Best,
> >> > > >> Rui
> >> > > >>
> >> > > >> On Tue, Dec 26, 2023 at 7:15 PM Rui Fan <1996fan...@gmail.com> 
> >> > > >> wrote:
> >> > > >>
> >> > > >> > Thanks Xintong for the quick feedback and the good suggestions!
> >> > > >> >
> >> > > >> > > 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY`
> >> > should be
> >> > > >> > "no
> >> > > >> > > default". We can explain in the description that if not
> >> > configured,
> >> > > >> > > `System.getProperty("log.file")` will be used, but that is not a
> >> > > >> default
> >> > > >> > > value.
> >> > > >> >
> >> > > >> > Explain it in the description is fine for me.
> >> > > >> >
> >> > > >> > > 2. So I wonder if we can simply mark them as deprecated and
> >> > remove in
> >> > > >> > 2.0.
> >> > > >> >
> >> > > >> > These features don't have a public option, but they work. I'm not
> >> > sure
> >> > > >> > whether these features are used by some advanced users.
> >> > > >> > Actually, I think some of them are valuable! For example:
> >> > > >> >
> >> > > >> > - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
> >> > > >> >   allows users to define the start command of the yarn container.
> >> > > >> > - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG allows
> >> > > >> >   flink job reads all files under the directory even if it has
> >> > nested
> >> > > >> > directories.
> >> > > >> >
> >> > > >> > This FLIP focuses on the refactor option, I'm afraid these 
> >> > > >> > features
> >> > are
> >> > > >> > used
> >> > > >> > in some production and removing these features will affect some
> >> > flink
> >> > > >> jobs.
> >> > > >> > So I prefer to keep these features, WDTY?
> >> > > >> >
> >> > > >> > > 3. Simply saying "getting / setting value with string key is
> >> > > >> discouraged"
> >> > > >> > > in JavaDoc of get/setString is IMHO a bit confusing. People may
> >> > have
> >> > > >> the
> >> > > >> > > question why would we keep the discouraged interfaces at all. I
> >> > would
> >> > > >> > > suggest the following:
> >> > > >> > > ```
> >> > > >> > > We encourage users and developers to always use ConfigOption for
> >> > > >> getting
> >> > > >> > /
> >> > > >> > > setting the configurations if possible, for its rich 
> >> > > >> > > description,
> >> > > >> type,
> >> > > >> > > default-value and other supports. The string-key-based getter /
> >> > setter
> >> > > >> > > should only be used when ConfigOption is not applicable, e.g., 
> >> > > >> > > the
> >> > > >> key is
> >> > > >> > > programmatically generated in runtime.
> >> > > >> > > ```
> >> > > >> >
> >> > > >> > Suggested comment is good for me, and I'd like to hear the thought
> >> > from
> >> > > >> > Xuannan
> >> > > >> > who wrote the original comment.
> >> > > >> >
> >> > > >> > Best,
> >> > > >> > Rui
> >> > > >> >
> >> > > >> > On Tue, Dec 26, 2023 at 5:26 PM Xintong Song 
> >> > > >> > <tonysong...@gmail.com
> >> > >
> >> > > >> > wrote:
> >> > > >> >
> >> > > >> >> Thanks for the efforts, Rui and Xuannan.
> >> > > >> >>
> >> > > >> >> I think it's a good idea to migrate string-key configuration
> >> > accesses
> >> > > >> to
> >> > > >> >> ConfigOption-s in general.
> >> > > >> >>
> >> > > >> >> I have a few suggestions / questions regarding the FLIP.
> >> > > >> >>
> >> > > >> >> 1. I think the default value for `TASK_MANAGER_LOG_PATH_KEY`
> >> > should be
> >> > > >> "no
> >> > > >> >> default". We can explain in the description that if not 
> >> > > >> >> configured,
> >> > > >> >> `System.getProperty("log.file")` will be used, but that is not a
> >> > > >> default
> >> > > >> >> value.
> >> > > >> >>
> >> > > >> >> 2. I wonder if the following string-keys can be simply removed?
> >> > They
> >> > > >> are
> >> > > >> >> neither set by Flink, nor documented anywhere (AFAIK) so that 
> >> > > >> >> users
> >> > > >> know
> >> > > >> >> how to set them. All of them were introduced a long time ago,
> >> > require
> >> > > >> >> significant knowledge and familiarity about Flink internals and 
> >> > > >> >> low
> >> > > >> level
> >> > > >> >> details in order to use, and some of them are even private. So I
> >> > > >> wonder if
> >> > > >> >> we can simply mark them as deprecated and remove in 2.0.
> >> > > >> >> - ConfigConstants.YARN_CONTAINER_START_COMMAND_TEMPLATE
> >> > > >> >> - FileInputFormat.FILE_PARAMETER_KEY
> >> > > >> >> - FileInputFormat.ENUMERATE_NESTED_FILES_FLAG
> >> > > >> >> - FileOutputFormat.FILE_PARAMETER_KEY
> >> > > >> >> - BinaryInputFormat.BLOCK_SIZE_PARAMETER_KEY
> >> > > >> >> - BinaryOutputFormat.BLOCK_SIZE_PARAMETER_KEY
> >> > > >> >>
> >> > > >> >> 3. Simply saying "getting / setting value with string key is
> >> > > >> discouraged"
> >> > > >> >> in JavaDoc of get/setString is IMHO a bit confusing. People may
> >> > have
> >> > > >> the
> >> > > >> >> question why would we keep the discouraged interfaces at all. I
> >> > would
> >> > > >> >> suggest the following:
> >> > > >> >> ```
> >> > > >> >> We encourage users and developers to always use ConfigOption for
> >> > > >> getting /
> >> > > >> >> setting the configurations if possible, for its rich description,
> >> > type,
> >> > > >> >> default-value and other supports. The string-key-based getter /
> >> > setter
> >> > > >> >> should only be used when ConfigOption is not applicable, e.g., 
> >> > > >> >> the
> >> > key
> >> > > >> is
> >> > > >> >> programmatically generated in runtime.
> >> > > >> >> ```
> >> > > >> >>
> >> > > >> >> WDYT?
> >> > > >> >>
> >> > > >> >> Best,
> >> > > >> >>
> >> > > >> >> Xintong
> >> > > >> >>
> >> > > >> >>
> >> > > >> >>
> >> > > >> >> On Tue, Dec 26, 2023 at 4:12 PM Rui Fan <1996fan...@gmail.com>
> >> > wrote:
> >> > > >> >>
> >> > > >> >> > Hi all,
> >> > > >> >> >
> >> > > >> >> > Xuannan(cced) and I would like to start a discussion on 
> >> > > >> >> > FLIP-405:
> >> > > >> >> > FLIP-405: Migrate string configuration key to ConfigOption[1].
> >> > > >> >> >
> >> > > >> >> > As Flink progresses to 2.0, we want to enhance the user
> >> > experience
> >> > > >> >> > with the existing configuration. In FLIP-77, Flink introduced
> >> > > >> >> ConfigOption
> >> > > >> >> > with DataType and strongly encourage users to utilize
> >> > ConfigOption
> >> > > >> >> > instead of string keys for accessing and setting Flink
> >> > > >> configurations.
> >> > > >> >> > Presently, many string configuration keys have been deprecated
> >> > and
> >> > > >> >> > replaced with ConfigOptions; however, some string configuration
> >> > > >> >> > keys are still in use.
> >> > > >> >> >
> >> > > >> >> > To ensure a better experience with the existing configuration 
> >> > > >> >> > in
> >> > > >> Flink
> >> > > >> >> > 2.0,
> >> > > >> >> > this FLIP will migrate all user-facing string configuration 
> >> > > >> >> > keys
> >> > to
> >> > > >> >> > ConfigOptions.
> >> > > >> >> > Additionally, we want to modify the Configuration 
> >> > > >> >> > infrastructure
> >> > to
> >> > > >> >> > promote the use of ConfigOption over string configuration keys
> >> > > >> >> > among developers and users. It's mentioned in a preview
> >> > thread[2].
> >> > > >> >> >
> >> > > >> >> > Looking forward to everyone's feedback and suggestions, thank
> >> > you!
> >> > > >> >> >
> >> > > >> >> > [1] https://cwiki.apache.org/confluence/x/6Yr5E
> >> > > >> >> > [2]
> >> > https://lists.apache.org/thread/zzsf7glfcdjcjm1hfo1xdwc6jp37nb3m
> >> > > >> >> >
> >> > > >> >> > Best,
> >> > > >> >> > Rui
> >> > > >> >> >
> >> > > >> >>
> >> > > >> >
> >> > > >>
> >> > > >
> >> >

Reply via email to