Hi everyone,

I've updated the FLIP (https://cwiki.apache.org/confluence/x/ngtRCg)
according to these discussions.


Regards
Ingo

On Thu, Jan 21, 2021 at 11:37 AM Ingo Bürk <i...@ververica.com> wrote:

> Hi Ufuk, Till,
>
> I definitely agree that having the Configuration be (or at least feel)
> immutable and complete seems like a better choice, and it is probably worth
> the trade-off in EV naming flexibility. Let me reshape the FLIP to propose
> something along the lines of solution (3).
>
> Regarding env.java.opts, what special handling is needed there? AFAICT
> only the rejected alternative of substituting values would've had an effect
> on this.
>
>
> Regards
> Ingo
>
> On Thu, Jan 21, 2021 at 11:13 AM Ufuk Celebi <u...@apache.org> wrote:
>
>> Thanks for starting the discussion, Ingo!
>>
>> Regarding approach 1:
>>
>> I like the idea of having a mapping scheme from ConfigOption to env
>> var(s), but I'm concerned about the implications of lazy eval. I think it
>> would be preferable to keep the Configuration object as the source of
>> truth, requiring us to do some form of eager evaluation.
>>
>> Regarding approach 2:
>>
>> I don't think we can assume that we know all config option keys. For
>> instance, I might write a custom high availability service or a custom
>> FileSystem plugin that has its own config options. It would be a pity (but
>> maybe tolerable) if env var config only worked with Flink's core
>> options.
>>
>> Regarding approach 3:
>>
>> What do you think about a mapping like
>> a) stripping the FLINK_CONFIG_ prefix,
>> b) replacing every __ with a hyphen,
>> c) replacing every remaining _ with a dot,
>> d) lowercasing* everything?
>>
>> Some examples for options that include both dots and hyphens:
>>
>> akka.client-socket-worker-pool.pool-size-factor =>
>>   FLINK_CONFIG_AKKA_CLIENT__SOCKET__WORKER__POOL_POOL__SIZE__FACTOR
>>
>> high-availability.zookeeper.quorum =>
>>   FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM
>>
>> It's not ideal, but easy to understand assuming that dots and hyphens are
>> the only special characters in config keys.
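
A minimal Java sketch of the mapping described above, for illustration only.
The class and method names (EnvKeyMapper, toConfigKey) are hypothetical and
not part of Flink or the FLIP. Judging by the two examples, a double
underscore stands for a hyphen and a single underscore for a dot, so the
double-underscore replacement has to run first; otherwise every "__" would
turn into "..".

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not an existing Flink class.
final class EnvKeyMapper {
    static final String PREFIX = "FLINK_CONFIG_";

    /** Maps a prefixed environment variable name to a config key; empty if the prefix is missing. */
    static Optional<String> toConfigKey(String envName) {
        if (!envName.startsWith(PREFIX)) {
            return Optional.empty();
        }
        String key = envName.substring(PREFIX.length())
                .replace("__", "-")          // double underscore -> hyphen (must run first)
                .replace("_", ".")           // remaining single underscore -> dot
                .toLowerCase(Locale.ROOT);   // locale-independent lowercasing
        return Optional.of(key);
    }

    public static void main(String[] args) {
        System.out.println(
                toConfigKey("FLINK_CONFIG_HIGH__AVAILABILITY_ZOOKEEPER_QUORUM").orElse("<none>"));
    }
}
```

Because of the lowercasing step, this only round-trips for keys that are
already lowercase and contain no underscores of their own.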
>>
>> Regarding the lower-casing step above: ConfigOption keys seem to be case
>> sensitive internally, but I couldn't find any user-facing documentation for
>> this. There should be no options that depend on this behaviour. So if I'm
>> not overlooking anything, I think it should be fine to make it case
>> insensitive internally when accessing the raw value of a ConfigOption.
>>
>> In addition, I think the FLIP should mention special cases such as
>> env.java.opts that are evaluated in the bash scripts and not in the Java
>> code.
>>
>> Cheers,
>>
>> Ufuk
>>
>> On Thu, Jan 21, 2021, at 8:57 AM, Ingo Bürk wrote:
>> > Hi everyone,
>> >
>> > I've now started a FLIP and am opening this discussion thread. Very much
>> > looking forward to your feedback!
>> >
>> > FLIP: https://cwiki.apache.org/confluence/x/ngtRCg
>> >
>> > The first big point I'd like to discuss is about the mechanism of "when"
>> > the EVs (environment variables) are looked up. I'll give three approaches
>> > here, the first of which is currently in the FLIP but very much open for
>> > change, and of course I'm happy to hear about different ideas entirely as
>> > well.
>> >
>> > 1) Lazy evaluation
>> >
>> > Only look up the EVs when an actual config key is requested from
>> > Configuration(#getRawValue), possibly with the addition of caching it once
>> > it has been looked up.
>> > The main benefit here is that no a-priori knowledge of available keys is
>> > required. The downside is that at no point in time do we have complete
>> > knowledge of the configuration. This currently only really affects
>> > Configuration#keySet, but it does impose a limitation on future development
>> > worth considering. It also changes Configuration, which is not limited to
>> > the Flink configuration, though this can easily be turned into an optional
>> > feature of Configuration.
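
To make the trade-off of approach 1 concrete, here is a rough Java sketch of
lazy lookup with caching. Everything in it is an assumption for illustration:
LazyEnvConfig is not Flink's Configuration, the naming rule in envNameFor
(upper-case, dots and hyphens to underscores) is only a stand-in for a
Spring-style scheme, and letting the environment win over the file value is
an assumed precedence.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.function.Function;

// Hypothetical stand-in for a lazily evaluated configuration.
final class LazyEnvConfig {
    private final Map<String, String> fileValues;
    private final Function<String, String> env; // e.g. System::getenv
    private final Map<String, Optional<String>> cache = new HashMap<>();

    LazyEnvConfig(Map<String, String> fileValues, Function<String, String> env) {
        this.fileValues = new HashMap<>(fileValues);
        this.env = env;
    }

    /** Assumed naming rule: "rest.port" -> "REST_PORT". */
    static String envNameFor(String key) {
        return key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    /** Consults the environment only when a key is requested, then caches the result. */
    Optional<String> getRawValue(String key) {
        return cache.computeIfAbsent(key, k -> {
            String fromEnv = env.apply(envNameFor(k));
            return fromEnv != null
                    ? Optional.of(fromEnv)                       // assumed: env overrides file
                    : Optional.ofNullable(fileValues.get(k));
        });
    }

    /** Only knows keys from the file; keys set purely via the environment are invisible here. */
    Set<String> keySet() {
        return fileValues.keySet();
    }

    public static void main(String[] args) {
        LazyEnvConfig config = new LazyEnvConfig(Map.of("parallelism.default", "1"), System::getenv);
        System.out.println(config.getRawValue("parallelism.default"));
        System.out.println(config.keySet());
    }
}
```

The keySet method shows the downside mentioned above: a key that exists only
as an environment variable can be read, but is never listed.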
>> >
>> > 2) Eager evaluation with full information
>> >
>> > If we centrally collect all possible Flink configuration keys in flink-core
>> > (quite a lot seem to be available already, but not all), we'd have complete
>> > information and could eagerly evaluate the environment and the precedence
>> > rules and populate the Configuration object accordingly. It also confines
>> > the implementation entirely to GlobalConfiguration.
>> > The downside is, however, that this shifts the design a bit towards having
>> > to know possible keys upfront. I'm also not sure how much effort it would be
>> > to collect all information in flink-core, or how "spread" this is across
>> > the codebase.
>> >
>> > 3) Eager evaluation through bijective mapping
>> >
>> > If we deviate from the Spring-style naming of the EVs we could potentially
>> > come up with a scheme that provides a bijection between EVs and config
>> > keys. If keys are further prefixed with something like FLINK_CONFIG_ (or
>> > anything to that effect), we could take all EVs with that prefix, map them
>> > to the corresponding config key name and eagerly populate Configuration.
>> > The main challenge is now defining this bijection, and we'd lose some
>> > "flexibility" in the naming of the EVs, so we'd end up with something like
>> > "$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty.
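
The eager variant of approach 3 could look roughly like the Java sketch
below. It is a sketch under assumptions, not the FLIP's design: the
EagerEnvLoader class is hypothetical, the name-to-key mapping is passed in as
a function because the exact bijection is still open, and environment values
are assumed to override values from the config file.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical loader, not an existing Flink class.
final class EagerEnvLoader {
    /**
     * Copies the file-based values and overlays every environment variable that
     * starts with the prefix, using the supplied mapping for the key name.
     */
    static Map<String, String> load(
            Map<String, String> fileValues,
            Map<String, String> env,
            String prefix,
            Function<String, String> nameToKey) {
        Map<String, String> result = new HashMap<>(fileValues);
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                String suffix = e.getKey().substring(prefix.length());
                result.put(nameToKey.apply(suffix), e.getValue()); // assumed: env wins
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder mapping only; the real scheme is what this thread is about.
        Function<String, String> mapper = s -> s.toLowerCase(Locale.ROOT).replace('_', '.');
        Map<String, String> config =
                load(Map.of("rest.port", "8081"), System.getenv(), "FLINK_CONFIG_", mapper);
        System.out.println(config.keySet());
    }
}
```

Since the whole environment is scanned once up front, the resulting map is
complete and could be treated as immutable afterwards, with no prior
knowledge of the available keys.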
>> >
>> > Happy to hear your thoughts on this!
>> >
>> >
>> > Regards
>> > Ingo
>> >
>>
>
