Hi everyone,

I've now started a FLIP and am opening this discussion thread. Very much
looking forward to your feedback!

FLIP: https://cwiki.apache.org/confluence/x/ngtRCg

The first big point I'd like to discuss is when the EVs (environment
variables) are looked up. I'll describe three approaches here. The first is
what the FLIP currently proposes, but it is very much open for change, and
of course I'm happy to hear entirely different ideas as well.

1) Lazy evaluation

Only look up the EVs when an actual config key is requested from
Configuration (via #getRawValue), possibly caching the value once it has
been looked up.
The main benefit here is that no a-priori knowledge of available keys is
required. The downside is that at no point in time do we have complete
knowledge of the configuration. This currently only really affects
Configuration#keySet, but it does impose a limitation on future development
that is worth considering. It also changes Configuration, which is not
limited to the Flink configuration, though this can easily be turned into an
optional feature of Configuration.
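
To make the trade-off concrete, here is a minimal sketch of the lazy
approach. It is not the FLIP's implementation: LazyEnvConfiguration is a
hypothetical stand-in for Configuration, the naming rule (upper-case, with
'.' and '-' turned into '_') is an assumption, and so is the precedence
(an EV wins over the file value). The environment is passed in as a map so
the sketch can be run without touching the real process environment.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical stand-in for Configuration; not Flink's actual class. */
class LazyEnvConfiguration {
    private final Map<String, String> fileValues;
    private final Map<String, String> env;
    // Caches both hits and misses so each key is resolved at most once.
    private final Map<String, Optional<String>> cache = new HashMap<>();

    LazyEnvConfiguration(Map<String, String> fileValues, Map<String, String> env) {
        this.fileValues = fileValues;
        this.env = env;
    }

    /** Assumed naming rule: "s3.access-key" becomes "S3_ACCESS_KEY". */
    static String toEnvName(String key) {
        return key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    /** Resolves the key on first request; the EV (assumed) takes precedence. */
    Optional<String> getRawValue(String key) {
        return cache.computeIfAbsent(key, k -> {
            String fromEnv = env.get(toEnvName(k));
            return Optional.ofNullable(fromEnv != null ? fromEnv : fileValues.get(k));
        });
    }

    /** Only sees keys from the file: EV-only keys stay invisible here. */
    Set<String> keySet() {
        return fileValues.keySet();
    }
}
```

Note how keySet() cannot report a key that is only set through an EV, which
is exactly the "incomplete knowledge" downside described above.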

2) Eager evaluation with full information

If we centrally collect all possible Flink configuration keys in flink-core
(quite a lot seem to be available already, but not all), we'd have complete
information and could eagerly evaluate the environment, apply the precedence
rules and populate the Configuration object accordingly. It also confines
the implementation entirely to GlobalConfiguration.
The downside is, however, that this shifts the design a bit toward having
to know all possible keys upfront. I'm also not sure how much effort it
would be to collect all information in flink-core, or how "spread" this is
across the codebase.
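
As a rough sketch of this approach, assuming the set of known keys can be
collected somewhere: EagerKnownKeysLoader and its parameters are
hypothetical names, the naming rule is the same assumed one (upper-case,
'.' and '-' turned into '_'), and "EV wins over file value" is an assumed
precedence rule, not something the FLIP has settled.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper; not part of Flink. */
class EagerKnownKeysLoader {

    /** Assumed naming rule: "s3.access-key" becomes "S3_ACCESS_KEY". */
    static String toEnvName(String key) {
        return key.toUpperCase(Locale.ROOT).replace('.', '_').replace('-', '_');
    }

    /**
     * Builds the complete configuration up front. Because the naming rule is
     * not invertible, we iterate over the known keys and probe the
     * environment for each of them, rather than scanning the environment.
     */
    static Map<String, String> load(
            Set<String> knownKeys, Map<String, String> fileValues, Map<String, String> env) {
        Map<String, String> result = new HashMap<>(fileValues);
        for (String key : knownKeys) {
            String fromEnv = env.get(toEnvName(key));
            if (fromEnv != null) {
                result.put(key, fromEnv); // assumed precedence: EV overrides file
            }
        }
        return result;
    }
}
```

Any key missing from knownKeys is silently ignored even if a matching EV is
set, which is why the completeness of the central key list matters here.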

3) Eager evaluation through bijective mapping

If we deviate from the Spring-style naming of the EVs we could potentially
come up with a scheme that provides a bijection between EVs and config
keys. If keys are further prefixed with something like FLINK_CONFIG_ (or
anything to that effect), we could take all EVs with that prefix, map them
to the corresponding config key name and eagerly populate Configuration.
The main challenge then becomes defining this bijection, and we'd lose some
"flexibility" in the naming of the EVs, so we'd end up with something like
"$FLINK_CONFIG_s3__access_key", which arguably doesn't look very pretty.
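
One possible scheme, purely as a sketch: I'm assuming here that '.' maps to
"__" and '-' maps to "_", which is consistent with the example above but is
not a settled design. PrefixedEnvMapping is a hypothetical name. The mapping
is only reversible for keys that contain no '_' and no two adjacent
separators, so the sketch rejects anything else instead of guessing.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper; not part of Flink. */
class PrefixedEnvMapping {
    static final String PREFIX = "FLINK_CONFIG_";

    /** Assumed scheme: '.' becomes "__", '-' becomes "_". */
    static String toEnvName(String key) {
        if (key.contains("_") || key.contains("..") || key.contains("--")
                || key.contains(".-") || key.contains("-.")) {
            // Such keys would make the encoding ambiguous, so refuse them.
            throw new IllegalArgumentException("Key not representable: " + key);
        }
        return PREFIX + key.replace(".", "__").replace("-", "_");
    }

    /** Inverse of toEnvName for names produced by it. */
    static String toConfigKey(String envName) {
        String body = envName.substring(PREFIX.length());
        // Decode double underscores first, then the remaining single ones.
        return body.replace("__", ".").replace("_", "-");
    }

    /** Eagerly collects all prefixed EVs; no list of known keys is needed. */
    static Map<String, String> load(Map<String, String> env) {
        Map<String, String> result = new HashMap<>();
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().startsWith(PREFIX)) {
                result.put(toConfigKey(e.getKey()), e.getValue());
            }
        }
        return result;
    }
}
```

In a real setup, load(System.getenv()) would be the entry point. The
restrictions on valid keys are where the lost "flexibility" shows up.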

Happy to hear your thoughts on this!


Regards
Ingo
