Let me try to summarize the current state of the discussion. I think we now 
have a few competing approaches.

# Approach 1: dynamic env var mapping

* This is the approach currently outlined in the FLIP
* The assumption that we only have dots and hyphens in config option keys seem 
to be wrong, e.g. custom backends or metrics reporters (see Chesnay's responses)

Due to the 2nd point, this option seems to be not viable. We could opt to only 
support the subset of keys that satisfy the restriction on their key, but this 
could be confusing to users.

I'm reading our discussion as rejecting this approach due to its limitations.

# Approach 2: single DYNAMIC_PROPERTIES env var

* Parse DYNAMIC_PROPERTIES env var during configuration load time
* The provided value may include multiple config option entries, similar to our 
docker-entrypoint script 
(https://github.com/apache/flink-docker/blob/master/1.12/scala_2.12-java8-debian/docker-entrypoint.sh#L79-L91)

This option would essentially canonicalize our existing docker-entrypoint 
script solution (which does not suffer from any limitations on the keys).

I think everybody in the discussion agrees that configuration of multiple 
options as a single value is cumbersome, but besides that everybody seems to be 
generally on board with this approach.

# Approach 3: main args-based dynamic properties (do nothing)

* No change required
* Map env vars to config options via command line args dynamic properties, e.g. 
-Dkey=$(VALUE)

I think this is the approach that Ingo now favours. The main question seems to 
be whether this is good enough and whether it works in all contexts we care 
about.

# Approach 4: substitute flink-conf.yaml values only

* flink-conf.yaml entries can reference env vars as values

This approach was rejected in initial DISCUSS thread, but there seems to be 
some unanswered questions here.

Roman brought up the question of providing a default mapping for all known keys 
which would allow individual env vars to configure most options out of the box. 
If an option is not part of the default mapping, users would be able to still 
define the mapping manually. The main down side seems to be that we would have 
to maintain the default mapping somehow. This is something we had  rejected 
earlier as too fragile.

# Summary

A few questions in order to move this thread forward:

* Do you agree that approach 1 has been rejected?
* Is there any objection to approach 2?
* Did we consider whether approach 3 is good enough?
* Is approach 4 a viable option after all? I think we should continue the 
discussion around the questions that were brought up.

I hope that we can reach a conclusion soon.

– Ufuk

On Wed, Jan 27, 2021, at 10:45 AM, Khachatryan Roman wrote:
> Hi Ingo,
> I missed that part of the discussion, sorry about that.
> Would you mind putting it to the FLIP page?
> 
> I guess you are referring to this message:
> > I don't think we can assume that we know all config option keys. For
> instance, I might write a custom high availability service or a custom
> FileSystem plugin that has it's own config options. It would be a pity (but
> maybe tolerable) if env var config would only work with Flink's core
> options.
> Can it be solved by adding an annotating and then scanning classpath via
> reflection?
> This is not ideal, but knowing all options in advance might be a right step
> towards eager config evaluation.
> 
> @Chesnay and Thomas
> Thanks for the clarification. IIUC, the format would be
> DYNAMIC_PROPERTIES='-Drest.port=1234 -Dother.option="iContainAn=Sign"'.
> But the evaluation WILL have nothing to do with System Properties as it
> will happen already inside the java process, right?
> This may be a bit confusing for users (but probably not a big issue though).
> 
> As for the substitution in file, Yang wrote:
> > It is true that we need to maintain a flink configuration file which
> simply maps all keys to some environment variables. But
> > for Yarn and K8s deployment(both standalone on K8s and native), we
> already have such a file, which is shipped from client
> > or mounted from a ConfigMap.
> Can we have a default file in Flink on the classpath with keys already
> mapped to env vars? (as I wrote in the very beginning).
> 
> Regards,
> Roman
> 
> 
> On Wed, Jan 27, 2021 at 9:24 AM Ingo Bürk <i...@ververica.com> wrote:
> 
> > Hi Yang,
> >
> > yeah, this is essentially the solution we also use in Ververica Platform
> > (except that we use ${…}). I can't really add anything other than the
> > previously stated reasons for why we decided to reject this, but of course
> > I'm also open to this solution still. I don't think Flink necessarily needs
> > to maintain a flink-conf.yaml with all environment variables; I would
> > assume users generally would want to override only some properties, not
> > most of them, and for a config like that basically any EV not set would end
> > up causing an error.
> >
> >
> > Regards
> > Ingo
> >
> > On Wed, Jan 27, 2021 at 4:42 AM Yang Wang <danrtsey...@gmail.com> wrote:
> >
> >> Hi all, sorry to butt in.
> >>
> >> I am little curious about why do we have to do the overwritten via
> >> environment variables after
> >> loading the configuration from file. Could we support to do the
> >> substitution while loading the
> >> "flink-conf.yaml" file?
> >>
> >> For example, I have the following config options in my flink-conf.yaml.
> >> fs.oss.accessKeyId: $(FS_OSS_ACCESS_KEY_ID)
> >> fs.oss.accessKeySecret: $(FS_OSS_ACCESS_KEY_SECRET)
> >>
> >> I expect these environment variables could be substituted when loading the
> >> configuration file. It is
> >> very similar to use "*envsubst < /tmp/flink-conf.yaml >
> >> /tmp/flink-conf-1.yaml*".
> >>
> >> I know this is a rejected alternative. But I think some reasons could not
> >> stand on.
> >> * We could use $(FS_OSS_ACCESS_KEY_ID) instead of ${FS_OSS_ACCESS_KEY_ID}
> >> for the environment definition
> >> to avoid escape issues. I think the Kubernetes has the same behavior[1].
> >> Maybe many users are already familiar with it.
> >> * Users do not need to know the conversion between flink config option and
> >> environment name. Because they could use
> >> any name they want.
> >> * It is true that we need to maintain a flink configuration file
> >> which simply maps all keys to some environment variables. But
> >> for Yarn and K8s deployment(both standalone on K8s and native), we already
> >> have such a file, which is shipped from client
> >> or mounted from a ConfigMap.
> >>
> >>
> >> @Ingo This solution could perfectly work with Kubernetes deployment and is
> >> easier to use. We could use a ConfigMap for
> >> storing the flink-conf.yaml, and using secrets to exposed as environment
> >> variables for the authentication informations.
> >>
> >>
> >> [1].
> >>
> >> https://kubernetes.io/docs/tasks/inject-data-application/define-environment-variable-container/#using-environment-variables-inside-of-your-config
> >>
> >>
> >> Best,
> >> Yang
> >>
> >> Chesnay Schepler <ches...@apache.org> 于2021年1月27日周三 上午8:03写道:
> >>
> >> > In the end DYNAMIC_PROPERTIES behaves exactly like env.java.opts;
> >> > meaning that the existing rules set by the JVM apply.
> >> >
> >> > Here's an example: export DYNAMIC_PROPERTIES='-Drest.port=1234
> >> > -Dother.option="iContainAn=Sign"'
> >> >
> >> > This would then be appended as is to the /java/ command.
> >> > (
> >> >      Conceptually at least; shells are annoying when it comes to
> >> > quotes/whitespace;  good old http://mywiki.wooledge.org/BashFAQ/050.
> >> >      Something like java ... $(eval echo ${DYNAMIC_PROPERTIES} handles
> >> > quotes nicely, but no spaces in values.
> >> >
> >> >      We should really move more logic from the scripts into the
> >> > BashJavaUtils...
> >> > )
> >> >
> >> > On 1/26/2021 11:17 PM, Khachatryan Roman wrote:
> >> > >> Here's an example: My option key is custom.my_backend_option. With
> >> the
> >> > >> current design, the corresponding env variable would be
> >> > >> CUSTOM_MY_BACKEND_OPTION, which would be converted into
> >> > >> custom.my.backend.option .
> >> > > I think we don't have to translate CUSTOM_MY_BACKEND_OPTION back.
> >> > Instead,
> >> > > we should use the key from the ConfigOption.
> >> > > I'm assuming that not  every ENV VAR will end up in the Configuration
> >> -
> >> > > only those for which a matching ConfigOptions is found.
> >> > >
> >> > > I'm also fine with a single ENV VAR (DYNAMIC_PROPERTIES). It's
> >> already a
> >> > > big improvement.
> >> > > In the future, we can consider adding smth like
> >> ConfigOption.withEnvVar
> >> > for
> >> > > some (most popular) options.
> >> > >
> >> > > However, escaping is still not clear to me: how would kv-pairs be
> >> > > separated? What if such a separator is in the value itself? What if
> >> '='
> >> > is
> >> > > in the value?
> >> > > Or am I missing something?
> >> > >
> >> > > Regards,
> >> > > Roman
> >> > >
> >> > >
> >> > > On Tue, Jan 26, 2021 at 6:41 PM Till Rohrmann <trohrm...@apache.org>
> >> > wrote:
> >> > >
> >> > >> Thinking a bit more about the DYNAMIC_PROPERTIES, I have to admit
> >> that I
> >> > >> like the fact that it works around the problem of encoding the key
> >> names
> >> > >> and that it is more powerful wrt to bulk changes. Also the fact that
> >> one
> >> > >> can copy past configuration snippets is quite useful. Given these
> >> > aspects
> >> > >> and that we wouldn't exclude any mentioned configuration scenarios, I
> >> > would
> >> > >> also be ok following this approach given that we support it for all
> >> > Flink
> >> > >> processes.
> >> > >>
> >> > >> Cheers,
> >> > >> Till
> >> > >>
> >> > >> On Tue, Jan 26, 2021 at 5:10 PM Ingo Bürk <i...@ververica.com>
> >> wrote:
> >> > >>
> >> > >>> Hi everyone,
> >> > >>>
> >> > >>> thanks for the livid discussion, it's great to see so many opinions
> >> and
> >> > >>> ideas!
> >> > >>>
> >> > >>> The point about underscores is a very valid point where the current
> >> > FLIP,
> >> > >>> if we were to stick with it, would have to be improved. I was going
> >> to
> >> > say
> >> > >>> that we should exclude that from the discussion about the merits of
> >> > >>> different overall solutions, but I am afraid that this makes the
> >> "how
> >> > to
> >> > >>> name EVs" question even more convoluted, and that in turn directly
> >> > impacts
> >> > >>> the usefulness of the FLIP as a whole which is about a more
> >> convenient
> >> > way
> >> > >>> of configuring Flink; names which are too cryptic will not achieve
> >> > that.
> >> > >>> So
> >> > >>> in this regard I am in agreement with Chesnay.
> >> > >>>
> >> > >>> After all these considerations, speaking from the Kubernetes
> >> context,
> >> > it
> >> > >>> seems to me that using the dynamic properties works best (I can use
> >> > config
> >> > >>> key names as-is) and requires no change, so I'm actually just
> >> leaning
> >> > >>> towards that. However, the Kubernetes context is, I guess, not the
> >> only
> >> > >>> one
> >> > >>> to consider.
> >> > >>>
> >> > >>>
> >> > >>> Best regards
> >> > >>> Ingo
> >> > >>>
> >> > >>> On Tue, Jan 26, 2021 at 3:48 PM Chesnay Schepler <
> >> ches...@apache.org>
> >> > >>> wrote:
> >> > >>>
> >> > >>>> Mind you that we could of course solve these character issues by
> >> first
> >> > >>>> nailing down which characters we allow in keys (presumably:
> >> > [a-z0-9-.]).
> >> > >>>>
> >> > >>>> On 1/26/2021 3:45 PM, Chesnay Schepler wrote:
> >> > >>>>> Here's an example: My option key is custom.my_backend_option. With
> >> > the
> >> > >>>>> current design, the corresponding env variable would be
> >> > >>>>> CUSTOM_MY_BACKEND_OPTION, which would be converted into
> >> > >>>>> custom.my.backend.option .
> >> > >>>>>
> >> > >>>>> It is true that users could still parse the original system
> >> property
> >> > >>>>> as a fall-back, but it seems to partially invalidate the goal and
> >> > >>>>> introduce the potential for surprises and inconsistent behavior.
> >> > >>>>>
> >> > >>>>> What would happen if the option were already defined in the
> >> > >>>>> flink-conf.yaml, but overwritten with the env variable? Users
> >> would
> >> > >>>>> have to check every time they access a configuration whether the
> >> > >>>>> system property was also set and resolve things manually.
> >> Naturally
> >> > >>>>> things might also conflict with whatever prioritization we come up
> >> > >>> with.
> >> > >>>>> Now you might say that this is only necessary if the option
> >> contains
> >> > >>>>> special characters, but then we're setting users up for a surprise
> >> > >>>>> should they ever rename an existing option to contain an
> >> underscore.
> >> > >>>>>
> >> > >>>>> As for newlines, I wouldn't expect newline characters to appear
> >> > within
> >> > >>>>> DYNAMIC_VARIABLE, but I guess it would follow the same behavior
> >> as if
> >> > >>>>> you would declare them on the command-line?
> >> > >>>>>
> >> > >>>>> One more downside I see is that from a users perspective I'd
> >> always
> >> > >>>>> have to do this conversion manually. You can't just copy stuff
> >> from
> >> > >>>>> the documentation (unless we duplicate every single mention), nor
> >> can
> >> > >>>>> you easily switch between environment variables and dynamic
> >> > >>>>> properties/flink-conf.yaml . For the use-cases that people seems
> >> to
> >> > be
> >> > >>>>> concerned about (where you have lots of variables) I would think
> >> that
> >> > >>>>> this is a deal-breaker.
> >> > >>>>>
> >> > >>>>> On 1/26/2021 2:59 PM, Khachatryan Roman wrote:
> >> > >>>>>> @Chesnay
> >> > >>>>>> could you explain how underscores in user-defined properties
> >> would
> >> > be
> >> > >>>>>> affected with transformation like STATE_BACKEND -> state.backend?
> >> > >>>>>> IIUC, this transformation happens in Flink and doesn't alter ENV
> >> > >>>>>> vars, so
> >> > >>>>>> the user app can still parse the original configuration.
> >> > >>>>>> OTH, I'm a bit concerned whether the newline should be escaped by
> >> > the
> >> > >>>>>> user
> >> > >>>>>> in DYNAMIC_VARIABLES.
> >> > >>>>>>
> >> > >>>>>> @Ingo Bürk <i...@ververica.com>
> >> > >>>>>> I feel a bit lost in the discussion) Maybe we can put an
> >> > intermediate
> >> > >>>>>> summary of pros and cons of different approaches into the FLIP?
> >> > >>>>>>
> >> > >>>>>> And for completeness, we could combine DYNAMIC_VARIABLES approach
> >> > >>> with
> >> > >>>>>> passing individual variables.
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> Regards,
> >> > >>>>>> Roman
> >> > >>>>>>
> >> > >>>>>>
> >> > >>>>>> On Tue, Jan 26, 2021 at 12:54 PM Chesnay Schepler <
> >> > >>> ches...@apache.org>
> >> > >>>>>> wrote:
> >> > >>>>>>
> >> > >>>>>>> I think we have to assume that some user has a custom config
> >> option
> >> > >>>>>>> that
> >> > >>>>>>> uses underscores.
> >> > >>>>>>>
> >> > >>>>>>> That said, we can probably assume that no one uses other special
> >> > >>>>>>> characters like question marks and such, which are indeed
> >> allowed
> >> > by
> >> > >>>>>>> the
> >> > >>>>>>> YAML spec.
> >> > >>>>>>>
> >> > >>>>>>> These kind of questions are precisely why I prefer the
> >> > >>>>>>> DYNAMIC_VARIABLES
> >> > >>>>>>> approach; you don't even have to worry about this stuff.
> >> > >>>>>>> The only question we'd have to answer is whether manually
> >> defined
> >> > >>>>>>> properties should take precedent or not.
> >> > >>>>>>>
> >> > >>>>>>> @Uce I can see how it could be cumbersome to modify, but at the
> >> > same
> >> > >>>>>>> time you can implement whatever other approach you want on top
> >> of
> >> > >>> it:
> >> > >>>>>>> // this is a /conceptual /example for an optional setting
> >> > >>>>>>> DYNAMIC_VARIABLES="${REST_PORT_SETTING}"
> >> > >>>>>>> if _someCondition_:
> >> > >>>>>>>      export REST_PORT_SETTING="-Drest.port=1234"
> >> > >>>>>>>
> >> > >>>>>>> // this is a /conceptual /example for a configurable setting
> >> > >>>>>>> DYNAMIC_VARIABLES="-Drest.port=${MY_FANCY_VARIABLE:-8200}"
> >> > >>>>>>> if _someCondition_:
> >> > >>>>>>>      export MY_FANCY_VARIABLE="1234"
> >> > >>>>>>>
> >> > >>>>>>> Additionally, this makes it quite easy to audit stuff, since we
> >> can
> >> > >>>>>>> just
> >> > >>>>>>> eagerly log what DYNAMIC_VARIABLES is set to.
> >> > >>>>>>>
> >> > >>>>>>> On 1/26/2021 12:48 PM, Xintong Song wrote:
> >> > >>>>>>>> @Ufuk,
> >> > >>>>>>>> I also don't find any existing options with underscores in
> >> their
> >> > >>> keys.
> >> > >>>>>>>> However, I do not find any explicit rules forbid using them
> >> > either.
> >> > >>>>>>>> I'm
> >> > >>>>>>> not
> >> > >>>>>>>> saying this should block the FLIP. Just it would be nice to
> >> beware
> >> > >>> of
> >> > >>>>>>> this
> >> > >>>>>>>> issue, and maybe ensure the assumption with test cases if we
> >> > >>> finally
> >> > >>>>>>> decide
> >> > >>>>>>>> to go with these mapping rules.
> >> > >>>>>>>>
> >> > >>>>>>>> Thank you~
> >> > >>>>>>>>
> >> > >>>>>>>> Xintong Song
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>>
> >> > >>>>>>>> On Tue, Jan 26, 2021 at 7:27 PM Ufuk Celebi <u...@apache.org>
> >> > >>> wrote:
> >> > >>>>>>>>> @Xingtong: The assumption for the mapping was that we only
> >> have
> >> > >>>>>>>>> dots and
> >> > >>>>>>>>> hyphens in the keys. Do you have an example for a key which
> >> > >>> include
> >> > >>>>>>>>> underscores? If underscores are common for keys (I couldn't
> >> find
> >> > >>> any
> >> > >>>>>>>>> existing options that use it), it would certainly be a blocker
> >> > for
> >> > >>>>>>>>> the
> >> > >>>>>>>>> discussed approach.
> >> > >>>>>>>>>
> >> > >>>>>>>>> On Tue, Jan 26, 2021, at 11:46 AM, Xintong Song wrote:
> >> > >>>>>>>>>>       - The naming conversions proposed in the FLIP seems not
> >> > >>>>>>>>>> bijective
> >> > >>>>>>> to
> >> > >>>>>>>>> me.
> >> > >>>>>>>>>>       There could be problems if the configuration key
> >> contains
> >> > >>>>>>> underscores.
> >> > >>>>>>>>>>          - a_b -> FLINK_CONFIG_a_b
> >> > >>>>>>>>>>          - FLINK_CONFIG_a_b -> a.b
> >> > >>>>>>>
> >> > >>>>
> >> >
> >> >
> >>
> >
> 

Reply via email to