Hi Fred,

Thanks for starting this discussion. I fully agree that this is an issue the community should solve. It has come up before and is still unsolved today, so it is great that you are offering your help. Let's clarify the implementation details.

1) Global vs. Local solution

Is this a DDL-only problem? If yes, it would be easier to solve it in the `FactoryUtil` that all Flink connectors and formats use.

2) Configuration vs. environment variables

I agree with Qingsheng that environment variables are not always straightforward to identify when you have both a "pre-flight phase" and a "cluster phase". In the DynamicTableFactory, one has access to the Flink configuration and could resolve `${...}` variables from there.
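
To make the idea of resolving `${...}` variables from configuration more concrete, here is a minimal sketch. It is only an illustration under stated assumptions: `OptionVariableResolver` is a hypothetical class and not an existing Flink API, the configuration is modelled as a plain `Map<String, String>` instead of Flink's own configuration types, and leaving unknown variables untouched is a choice of this sketch, not an agreed behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, NOT part of Flink: resolves ${name} tokens in table
// options against a plain key/value view of the configuration.
final class OptionVariableResolver {

    // Matches ${name}; group 1 captures the variable name.
    private static final Pattern TOKEN = Pattern.compile("\\$\\{([^}]+)\\}");

    // Returns a copy of the table options with every value resolved.
    static Map<String, String> resolve(
            Map<String, String> tableOptions, Map<String, String> config) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (Map.Entry<String, String> option : tableOptions.entrySet()) {
            resolved.put(option.getKey(), resolveValue(option.getValue(), config));
        }
        return resolved;
    }

    // Replaces each ${name} in a single value with config.get(name).
    static String resolveValue(String value, Map<String, String> config) {
        Matcher matcher = TOKEN.matcher(value);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String replacement = config.get(matcher.group(1));
            // Assumption of this sketch: unknown names stay as written, so a
            // typo remains visible instead of becoming an empty string.
            String text = replacement != null ? replacement : matcher.group();
            matcher.appendReplacement(out, Matcher.quoteReplacement(text));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("secret_kafka_ssl_key_password", "value-from-config");

        Map<String, String> options = new LinkedHashMap<>();
        options.put("connector", "kafka");
        options.put("properties.ssl.key.password", "${secret_kafka_ssl_key_password}");

        System.out.println(resolve(options, config));
    }
}
```

Doing the substitution on the whole option map in one place is what would make a central hook such as `FactoryUtil` attractive: individual connectors and formats would only ever see resolved values.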


What do you think?

Regards,
Timo


On 01.04.22 at 12:26, Qingsheng Ren wrote:
Hi Fred,

Thanks for raising the discussion! I think the definition of “environment 
variable” varies with context. With Flink on K8s it means the environment 
variable of a container, and for a SQL client user it could refer to an 
environment variable of the SQL client, or even to the system properties of 
the JVM. So the term “environment variable” is a bit vague across 
environments.

A more generic solution in my mind is to take advantage of configurations in 
Flink and pass table options dynamically by adding configs to TableConfig or 
even flink-conf.yaml. For example, the option 
“table.dynamic.options.my_catalog.my_db.my_table.accessId = foo” would add the 
table option “accessId = foo” to the table “my_catalog.my_db.my_table”. This 
way we could decouple the DDL statement from table options that contain secret 
credentials. What do you think?
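
As an illustration of how such prefixed configuration entries could be mapped back to table options, here is a minimal sketch. The key prefix `table.dynamic.options.` is taken from the proposal above and is not an existing Flink option; `DynamicTableOptions` is a hypothetical helper, the configuration is modelled as a plain `Map<String, String>`, and the sketch assumes that catalog, database, and table names contain no dots.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, NOT part of Flink: collects the table options that a
// flat configuration defines for one specific table.
final class DynamicTableOptions {

    // Prefix as proposed in this thread; an assumption, not an existing option.
    static final String PREFIX = "table.dynamic.options.";

    static Map<String, String> forTable(
            Map<String, String> config, String catalog, String database, String table) {
        // Assumes the three identifiers themselves contain no '.' characters.
        String tablePrefix = PREFIX + catalog + "." + database + "." + table + ".";
        Map<String, String> options = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : config.entrySet()) {
            if (entry.getKey().startsWith(tablePrefix)) {
                // Everything after the table prefix is the option key, which
                // may itself contain dots (e.g. properties.ssl.key.password).
                options.put(entry.getKey().substring(tablePrefix.length()), entry.getValue());
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> config = new LinkedHashMap<>();
        config.put("table.dynamic.options.my_catalog.my_db.my_table.accessId", "foo");
        config.put("parallelism.default", "4");
        System.out.println(forTable(config, "my_catalog", "my_db", "my_table"));
    }
}
```

The extracted options would then have to be merged with the options from the DDL statement; which side wins on a conflict is left open here.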

Best regards,

Qingsheng

On Mar 30, 2022, at 16:25, Teunissen, F.G.J. (Fred) 
<fred.teunis...@ing.com.INVALID> wrote:

Hi devs,

Some SQL table properties contain sensitive data, such as passwords, that we do 
not want to expose to other users in the VVP UI. Also, having them in clear 
text in a SQL statement is not secure. For example,

CREATE TABLE Orders (
    `user` BIGINT,
    product STRING,
    order_time TIMESTAMP(3)
) WITH (
    'connector' = 'kafka',

    'properties.bootstrap.servers' = 'kafka-host-1:9093,kafka-host-2:9093',
    'properties.security.protocol' = 'SSL',
    'properties.ssl.key.password' = 'should-be-a-secret',
    'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
    'properties.ssl.keystore.password' = 'should-also-be-a-secret',
    'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
    'properties.ssl.truststore.password' = 'should-again-be-a-secret',
    'scan.startup.mode' = 'earliest-offset'
);

I would like to bring up for discussion a proposal to provide these secret 
values via environment variables, since these can be populated from a K8s 
ConfigMap or Secret.

To implement the SQL table properties, connectors and formats use the 
ConfigOption<T> class. This class could be extended so that it checks whether 
the config value contains certain tokens, like ‘${env-var-name}’. If it does, 
it could fetch the value of that environment variable and use it to replace the 
token in the config value.

The above SQL statement would then look like,

CREATE TABLE Orders (
    `user` BIGINT,
    product STRING,
    order_time TIMESTAMP(3)
) WITH (
    'connector' = 'kafka',

    'properties.bootstrap.servers' = 'kafka-host-1:9093,kafka-host-2:9093',
    'properties.security.protocol' = 'SSL',
    'properties.ssl.key.password' = '${secret_kafka_ssl_key_password}',
    'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
    'properties.ssl.keystore.password' = '${secret_kafka_ssl_keystore_password}',
    'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
    'properties.ssl.truststore.password' = '${secret_kafka_ssl_truststore_password}',
    'scan.startup.mode' = 'earliest-offset'
);

For the purpose of secrets I don’t think you need any complex processing of 
tokens but perhaps there are other usages as well. For instance,

    'properties.bootstrap.servers' = 'kafka-${otap_env}-1:9093,kafka-${otap_env}-2:9093',

It is possible (though I think unlikely) that someone wants a literal property 
value like ‘${not-an-env-var}’, so you need to be able to escape the ‘$’ token, 
for example as ‘$${not-an-env-var}’. This also means that, in theory, the 
change would break compatibility for existing values that already contain 
‘${...}’.
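
The token replacement and the escape rule described above could be sketched roughly as follows. This is an illustration under assumptions, not a proposed patch: `EnvTokenResolver` is a hypothetical class that does not exist in Flink, the lookup is passed in as a function so that `System::getenv` can be used in a real deployment and a map in tests, and failing on an undefined variable is a choice of this sketch.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, NOT part of Flink: replaces ${name} tokens in a single
// property value and treats "$${" as an escaped, literal "${".
final class EnvTokenResolver {

    static String resolve(String value, Function<String, String> lookup) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < value.length()) {
            if (value.startsWith("$${", i)) {
                // Escaped token: emit a literal "${" and copy the rest as-is.
                out.append("${");
                i += 3;
            } else if (value.startsWith("${", i)) {
                int end = value.indexOf('}', i + 2);
                if (end < 0) {
                    // Unterminated token: keep the remainder unchanged.
                    out.append(value, i, value.length());
                    break;
                }
                String name = value.substring(i + 2, end);
                String replacement = lookup.apply(name);
                if (replacement == null) {
                    // Assumption of this sketch: a missing variable is an error.
                    throw new IllegalArgumentException("No value for variable '" + name + "'");
                }
                out.append(replacement);
                i = end + 1;
            } else {
                out.append(value.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // A map stands in for the process environment; a real deployment
        // could pass System::getenv as the lookup instead.
        Map<String, String> fakeEnv = Collections.singletonMap("otap_env", "test");
        System.out.println(resolve("kafka-${otap_env}-1:9093,kafka-${otap_env}-2:9093", fakeEnv::get));
        System.out.println(resolve("$${not-an-env-var}", fakeEnv::get));
    }
}
```

Only the sequence `$${` is treated as an escape here, so values that merely contain `$$` stay unchanged; that keeps the compatibility impact limited to values that already contain `${`.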

Looking forward to your feedback!

Best,
Fred Teunissen


