Hi Ufuk, Till,

I definitely agree that having the Configuration be (or at least feel)
immutable and complete seems like a better choice, and it is probably worth
the trade-off in EV naming flexibility. Let me reshape the FLIP to propose
something along the lines of solution (3).
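
As a rough sketch of what such a mapping could look like in Java, assuming
the FLINK_CONFIG_ prefix and the rules Ufuk suggests below (double
underscore to hyphen, single underscore to dot, then lowercasing). The class
and method names are made up for illustration and are not existing Flink API:

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink; names are for illustration only.
final class EnvVarMapping {
    // Assumed prefix, as floated in this thread.
    static final String PREFIX = "FLINK_CONFIG_";

    // Maps a prefixed environment variable name to a config key.
    // The double underscore must be handled before the single one.
    static String toConfigKey(String envVar) {
        if (!envVar.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Missing prefix: " + envVar);
        }
        return envVar.substring(PREFIX.length())
                .replace("__", "-")
                .replace("_", ".")
                .toLowerCase(Locale.ROOT);
    }

    // Eagerly collects all prefixed variables from the given environment
    // (e.g. System.getenv()) into a key/value map.
    static Map<String, String> fromEnvironment(Map<String, String> env) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : env.entrySet()) {
            if (entry.getKey().startsWith(PREFIX)) {
                result.put(toConfigKey(entry.getKey()), entry.getValue());
            }
        }
        return result;
    }
}
```

With this, FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM would map to
high-availability.zookeeper.quorum. Runs of three or more underscores would
be ambiguous under these rules, so the FLIP would need to say how to treat
them.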

Regarding env.java.opts, what special handling is needed there? AFAICT only
the rejected alternative of substituting values would've had an effect on
this.


Regards
Ingo

On Thu, Jan 21, 2021 at 11:13 AM Ufuk Celebi <u...@apache.org> wrote:

> Thanks for starting the discussion, Ingo!
>
> Regarding approach 1:
>
> I like the idea of having a mapping scheme from ConfigOption to env
> var(s), but I'm concerned about the implications of lazy eval. I think it
> would be preferable to keep the Configuration object as the source of
> truth, requiring us to do some form of eager evaluation.
>
> Regarding approach 2:
>
> I don't think we can assume that we know all config option keys. For
> instance, I might write a custom high availability service or a custom
> FileSystem plugin that has its own config options. It would be a pity (but
> maybe tolerable) if env var config would only work with Flink's core
> options.
>
> Regarding approach 3:
>
> What do you think about a mapping like
> a) stripping the FLINK_CONFIG_ prefix,
> b) replacing every __ with a hyphen,
> c) replacing every remaining _ with a dot,
> d) lowercasing* everything?
>
> Some examples for options that include both dots and hyphens:
>
> akka.client-socket-worker-pool.pool-size-factor =>
>   FLINK_CONFIG_AKKA_CLIENT__SOCKET__WORKER__POOL_POOL__SIZE__FACTOR
>
> high-availability.zookeeper.quorum =>
>   FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM
>
> It's not ideal, but easy to understand assuming that dots and hyphens are
> the only special characters in config keys.
>
> Regarding the lower-casing step above: ConfigOption keys seem to be case
> sensitive internally, but I couldn't find any user-facing documentation for
> this. There should be no options that depend on this behaviour. So if I'm
> not overlooking anything, I think it should be fine to make it case
> insensitive internally when accessing the raw value of a ConfigOption.
>
> In addition, I think the FLIP should mention special cases such as
> env.java.opts that are evaluated in the bash scripts and not in the Java
> code.
>
> Cheers,
>
> Ufuk
>
> On Thu, Jan 21, 2021, at 8:57 AM, Ingo Bürk wrote:
> > Hi everyone,
> >
> > I've now started a FLIP and am opening this discussion thread. Very much
> > looking forward to your feedback!
> >
> > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg
> >
> > The first big point I'd like to discuss is about the mechanism of "when"
> > the EVs (environment variables) are looked up. I'll give three approaches
> > here, the first of which is currently in the FLIP but very much open for
> > change, and of course I'm happy to hear about different ideas entirely as
> > well.
> >
> > 1) Lazy evaluation
> >
> > Only look up the EVs when an actual config key is requested from
> > Configuration(#getRawValue), possibly with the addition of caching it
> > once it has been looked up.
> > The main benefit here is that no a-priori knowledge of available keys is
> > required. The downside is that at no point in time do we have complete
> > knowledge of the configuration. This currently only really affects
> > Configuration#keySet, but it does impose a limitation on future
> > development worth considering. It also changes Configuration, which is
> > not limited to the Flink configuration, though this can easily be turned
> > into an optional feature of Configuration.
> >
> > 2) Eager evaluation with full information
> >
> > If we centrally collect all possible Flink configuration keys in
> > flink-core (quite a lot seem to be available already, but not all), we'd
> > have complete information and could eagerly evaluate the environment and
> > the precedence rules and populate the Configuration object accordingly.
> > It also confines the implementation entirely to GlobalConfiguration.
> > The downside is, however, that this shifts the design a bit towards
> > having to know possible keys upfront. I'm also not sure how much effort
> > it would be to collect all information in flink-core, or how "spread"
> > this is across the codebase.
> >
> > 3) Eager evaluation through bijective mapping
> >
> > If we deviate from the Spring-style naming of the EVs we could
> > potentially come up with a scheme that provides a bijection between EVs
> > and config keys. If keys are further prefixed with something like
> > FLINK_CONFIG_ (or anything to that effect), we could take all EVs with
> > that prefix, map them to the corresponding config key name and eagerly
> > populate Configuration.
> > The main challenge is now defining this bijection, and we'd lose some
> > "flexibility" in the naming of the EVs, so we'd end up with something
> > like "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very
> > pretty.
> >
> > Happy to hear your thoughts on this!
> >
> >
> > Regards
> > Ingo
> >
>
