Hi community,

We are working on secure Flink on YARN. The current Flink-YARN-Kerberos
integration requires each container of a job to log in to Kerberos via
keytab periodically (every 24 hours, say), and it does not use the Hadoop
delegation token mechanism except when localizing the container. While
fixing the current Flink-YARN-Kerberos integration (FLINK-8275) and trying
to add more features (FLINK-7860), I developed some concerns about the
current implementation. It can pose a scalability problem for the KDC:
for example, if the YARN cluster is restarted, tens of thousands of
containers all log in at once and effectively DDoS the KDC.

I would like to propose improving the current Flink-YARN-Kerberos
integration along the following lines:
1) The AppMaster (JobManager) periodically authenticates with the KDC and
obtains all delegation tokens (DTs) required by the job.
2) All other TM or TE containers periodically retrieve new DTs from the
AppMaster (either through a secure HDFS folder or a secure Akka channel).
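
To make the two steps concrete, here is a rough Java sketch of the flow.
It is only an illustration under assumptions: every class and method name
in it is hypothetical (none of them is an existing Flink or Hadoop API),
the KDC login is replaced by a plain Supplier, and the transport between
AppMaster and containers (HDFS folder or Akka channel) is reduced to a
direct method call.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Sketch only: all names here are hypothetical, not Flink or Hadoop APIs. */
final class TokenDistributionSketch {

    /** Opaque credential bundle with an expiry time. */
    static final class Tokens {
        final byte[] payload;
        final Instant expiresAt;

        Tokens(byte[] payload, Instant expiresAt) {
            this.payload = payload;
            this.expiresAt = expiresAt;
        }
    }

    /** Runs in the AppMaster: the only component that contacts the KDC. */
    static final class AppMasterTokenManager {
        // Stands in for "keytab login + fetch delegation tokens".
        private final Supplier<Tokens> kdcLogin;
        private final AtomicReference<Tokens> current = new AtomicReference<>();

        AppMasterTokenManager(Supplier<Tokens> kdcLogin) {
            this.kdcLogin = kdcLogin;
        }

        /** Meant to be called on a timer; publishes freshly obtained tokens. */
        void renew() {
            current.set(kdcLogin.get());
        }

        /** What containers would call over the secure channel. */
        Tokens latest() {
            Tokens tokens = current.get();
            if (tokens == null) {
                throw new IllegalStateException("renew() has not run yet");
            }
            return tokens;
        }
    }

    /** Runs in each TM/TE container: never contacts the KDC itself. */
    static final class ContainerTokenClient {
        private final AppMasterTokenManager appMaster;
        private Tokens local;

        ContainerTokenClient(AppMasterTokenManager appMaster) {
            this.appMaster = appMaster;
        }

        /** Fetches tokens only if the local copy is missing or near expiry. */
        boolean refreshIfNeeded(Instant now, Duration margin) {
            if (local != null && now.plus(margin).isBefore(local.expiresAt)) {
                return false;
            }
            local = appMaster.latest();
            return true;
        }
    }

    public static void main(String[] args) {
        AtomicInteger kdcLogins = new AtomicInteger();
        Instant start = Instant.parse("2018-01-01T00:00:00Z");
        AppMasterTokenManager appMaster = new AppMasterTokenManager(() -> {
            kdcLogins.incrementAndGet();
            return new Tokens(new byte[] {1}, start.plus(Duration.ofHours(24)));
        });
        appMaster.renew();

        // 1000 containers start at once, yet none of them hits the KDC.
        for (int i = 0; i < 1000; i++) {
            new ContainerTokenClient(appMaster).refreshIfNeeded(start, Duration.ofHours(1));
        }
        System.out.println("KDC logins: " + kdcLogins.get()); // prints "KDC logins: 1"
    }
}
```

The point of the sketch is that the number of KDC logins depends on how
often the AppMaster renews, not on the number of containers.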

We also want to extend Flink to support a pluggable AuthN mechanism,
because we have our own internal AuthN mechanism. We would like to add
support in Flink for authenticating periodically against our internal
AuthN service as well (through, e.g., dynamic class loading), and to use
a similar mechanism to distribute the credentials from the AppMaster to
the containers.
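
As a rough illustration of the dynamic class loading idea, the sketch
below loads a credential provider by class name, as if the name came from
the Flink configuration. The CredentialProvider interface, the
DummyProvider class, and the load() helper are all hypothetical names I
made up for this mail; only Class.forName and the reflection calls are
standard Java.

```java
import java.nio.charset.StandardCharsets;

/** Sketch only: these types are hypothetical, not an existing Flink API. */
final class PluggableAuthSketch {

    /** Contract that a custom AuthN mechanism would implement. */
    public interface CredentialProvider {
        /** Authenticates against the backing service, returns serialized credentials. */
        byte[] obtainCredentials() throws Exception;
    }

    /** Stand-in provider; a real one would call the internal AuthN service. */
    public static final class DummyProvider implements CredentialProvider {
        public DummyProvider() {}

        @Override
        public byte[] obtainCredentials() {
            return "dummy-credential".getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Instantiates the named provider through its public no-arg constructor. */
    static CredentialProvider load(String className) throws ReflectiveOperationException {
        Class<? extends CredentialProvider> cls =
                Class.forName(className).asSubclass(CredentialProvider.class);
        return cls.getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        // In a real setup this string would come from configuration.
        String configured = DummyProvider.class.getName();
        CredentialProvider provider = load(configured);
        System.out.println(new String(provider.obtainCredentials(), StandardCharsets.UTF_8));
        // prints "dummy-credential"
    }
}
```

The AppMaster would call such a provider on its renewal timer and ship the
resulting bytes to the containers the same way it ships the DTs.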

I would appreciate any comments and feedback. I can also write a design
doc or create a FLIP if needed. Thanks a lot.

Shuyi



-- 
"So you have to trust that the dots will somehow connect in your future."
