[ https://issues.apache.org/jira/browse/FLINK-28291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
jiulong.zhu updated FLINK-28291:
--------------------------------
    Labels:   (was: patch)

> Add kerberos delegation token renewer feature instead of logged from keytab individually
> ----------------------------------------------------------------------------------------
>
>                 Key: FLINK-28291
>                 URL: https://issues.apache.org/jira/browse/FLINK-28291
>             Project: Flink
>          Issue Type: New Feature
>          Components: Deployment / YARN
>    Affects Versions: 1.13.5
>            Reporter: jiulong.zhu
>            Priority: Minor
>             Fix For: 1.13.5
>
>         Attachments: FLINK-28291.0001.patch
>
>
> h2. 1. Design
> LifeCycle of delegation token in RM:
> # The container starts with the delegation token (DT) given by the client.
> # Enable the delegation token renewer by:
> ## setting {{security.kerberos.token.renew.enabled}} to true (default: false), and
> ## specifying {{security.kerberos.login.keytab}} and {{security.kerberos.login.principal}}.
> # When the delegation token renewer is enabled, the renewer thread re-obtains tokens from the DelegationTokenProvider (currently only HadoopFSDelegationTokenProvider). The renewer thread then broadcasts the new tokens to the RM locally, and to all JMs and all TMs through RPCGateway.
> # The RM process adds the new tokens to its context through UserGroupInformation.
> LifeCycle of delegation token in JM / TM:
> # The TaskManager starts with the keytab stored in remote HDFS.
> # On successful registration, the JM / TM receives the RM's current tokens, wrapped in {{JobMasterRegistrationSuccess}} / {{TaskExecutorRegistrationSuccess}}.
> # The JM / TM process adds the new tokens to its context through UserGroupInformation.
> Retrieving the ResourceManager leader through HAService is heavyweight and unnecessary, so DelegationTokenManager is instantiated by the ResourceManager. DelegationToken can therefore hold a direct reference to the ResourceManager instead of the RM RPCGateway or the self gateway.
> h2. 2. Test
> # No local JUnit test. Building a JUnit environment that includes a KDC and a local Hadoop is too heavy.
> # Cluster test
> step 1: Specify a krb5.conf with a short token lifetime (ticket_lifetime, renew_lifetime) when submitting the Flink application.
> ```
> {{flink run .... -yD security.kerberos.token.renew.enabled=true -yD security.kerberos.krb5-conf.path=/home/work/krb5.conf -yD security.kerberos.login.use-ticket-cache=false ...}}
> ```
> step 2: Watch the logs for token identifier changes and for their synchronization between the RM and the workers.
> >>
> In RM / JM log,
> 2022-06-28 15:13:03,509 INFO org.apache.flink.runtime.util.HadoopUtils [] - New token (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) created in KerberosDelegationToken, and next schedule delay is 64799880 ms.
> 2022-06-28 15:13:03,529 INFO org.apache.flink.runtime.util.HadoopUtils [] - Updating delegation tokens for current user.
> 2022-06-28 15:13:04,729 INFO org.apache.flink.runtime.util.HadoopUtils [] - JobMaster receives new token (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) from RM.
> …
> 2022-06-29 09:13:03,732 INFO org.apache.flink.runtime.util.HadoopUtils [] - New token (HDFS_DELEGATION_TOKEN token 52310 for work on ha-hdfs:newfyyy) created in KerberosDelegationToken, and next schedule delay is 64800045 ms.
> 2022-06-29 09:13:03,805 INFO org.apache.flink.runtime.util.HadoopUtils [] - Updating delegation tokens for current user.
> 2022-06-29 09:13:03,806 INFO org.apache.flink.runtime.util.HadoopUtils [] - JobMaster receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on ha-hdfs:newfyyy) from RM.
> >>
> In TM log,
> 2022-06-28 15:13:17,983 INFO org.apache.flink.runtime.util.HadoopUtils [] - TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52101 for work on ha-hdfs:newfyyy) from RM.
> 2022-06-28 15:13:18,016 INFO org.apache.flink.runtime.util.HadoopUtils [] - Updating delegation tokens for current user.
> …
> 2022-06-29 09:13:03,809 INFO org.apache.flink.runtime.util.HadoopUtils [] - TaskManager receives new token (HDFS_DELEGATION_TOKEN token 52310 for work on ha-hdfs:newfyyy) from RM.
> 2022-06-29 09:13:03,836 INFO org.apache.flink.runtime.util.HadoopUtils [] - Updating delegation tokens for current user.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
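
The renewal flow in the design section (the RM re-obtains tokens, broadcasts them to every registered JM / TM, and hands its current tokens to a newly registered JM / TM) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the attached patch: the class {{TokenRenewerSketch}}, its methods, and the use of plain strings in place of Hadoop tokens are all hypothetical, and the real implementation goes through RPCGateway and UserGroupInformation rather than in-process callbacks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical stand-in for the RM-side renewer; names are not taken from the patch. */
class TokenRenewerSketch {
    // Stands in for a DelegationTokenProvider such as HadoopFSDelegationTokenProvider.
    private final Supplier<String> tokenProvider;
    // Stands in for the registered JMs and TMs that must receive every new token.
    private final List<Consumer<String>> receivers = new ArrayList<>();
    // The token the RM currently holds (initially the one given by the client).
    private String currentToken;

    TokenRenewerSketch(Supplier<String> tokenProvider, String initialToken) {
        this.tokenProvider = tokenProvider;
        this.currentToken = initialToken;
    }

    /** A newly registered JM / TM immediately receives the RM's current token. */
    synchronized void register(Consumer<String> receiver) {
        receivers.add(receiver);
        receiver.accept(currentToken);
    }

    /** One renewal round: re-obtain a token, keep it, and broadcast it to all receivers. */
    synchronized String renewOnce() {
        currentToken = tokenProvider.get();
        for (Consumer<String> receiver : receivers) {
            receiver.accept(currentToken);
        }
        return currentToken;
    }
}
```

In a real deployment something like a scheduled executor would call {{renewOnce()}} repeatedly; in the logs above the reported next schedule delay is about 64,800,000 ms, roughly 18 hours. Both methods are synchronized so that a receiver registering during a renewal round cannot see tokens out of order.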