Please see my answers inline. I hope they answer all of your questions.
G
On Thu, Feb 3, 2022 at 9:17 AM Chesnay Schepler <ches...@apache.org> wrote:
I have a few questions that I'd appreciate you answering.
1. How does the Provider know whether it is required or not?
Every properly registered provider is going to be loaded and asked to
obtain tokens. It is worth mentioning that every provider has the right to
decide whether it wants to obtain tokens or not (bool
delegationTokensRequired()). For instance, if a provider detects that
HBase is not on the classpath or not configured properly, then no tokens
are obtained from that specific provider.
You may ask how a provider is registered. Here it is:
The provider is on the classpath and there is a META-INF file which
contains the name of the provider, for example:
META-INF/services/org.apache.flink.runtime.security.token.DelegationTokenProvider
<https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1#diff-b65ee7e64c5d2dfbb683d3569fc3e42f4b5a8052ab83d7ac21de5ab72f428e0b>
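To make the registration and opt-out steps concrete, here is a minimal sketch using the standard java.util.ServiceLoader. The interface below is a simplified stand-in that models only the two methods mentioned in this thread, and ProviderLoader is a hypothetical helper name; the actual code is in the POC. The registration file is expected to list the fully qualified implementation class names, one per line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Simplified stand-in for the provider interface (assumption: only the
// two methods mentioned in this thread are modelled).
interface DelegationTokenProvider {
    String name();

    boolean delegationTokensRequired();
}

// Hypothetical helper, not the POC code.
final class ProviderLoader {

    // Loads every provider registered through a META-INF/services file and
    // keeps only those which report that tokens are required.
    static List<DelegationTokenProvider> loadRequiredProviders() {
        List<DelegationTokenProvider> required = new ArrayList<>();
        for (DelegationTokenProvider provider :
                ServiceLoader.load(DelegationTokenProvider.class)) {
            if (provider.delegationTokensRequired()) {
                required.add(provider);
            }
        }
        return required;
    }

    public static void main(String[] args) {
        for (DelegationTokenProvider provider : loadRequiredProviders()) {
            System.out.println("Loaded provider: " + provider.name());
        }
    }
}
```

Without a registration file on the classpath the loader simply finds nothing, so an unregistered provider is never asked for tokens.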
1. How does the configuration of Providers work (how do they get
access to a configuration)?
Flink configuration is going to be passed to all providers. Please see the
POC here:
https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1
Service-specific configurations are loaded on the fly. For example, in the
HBase case the provider looks for the HBase configuration class, which is
instantiated within the provider.
1. How does a user select providers? (Is it purely based on the
provider being on the classpath?)
Providers can be explicitly turned off with the following config:
"security.kerberos.tokens.${name}.enabled". I've never seen two different
implementations exist for a specific external service, but if this edge
case came up, then the mentioned config would need to be added, and a new
provider with a different name would need to be implemented and registered.
All in all, we've seen that provider handling is not a user-specific task
but a cluster admin one. If a specific provider is needed, it's implemented
once per company and registered once on the clusters, and then all users
may or may not use the obtained tokens.
It is worth mentioning that the system will know which token needs to be
used when HDFS is accessed; this part is automatic.
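A minimal sketch of the on/off switch, assuming the configuration is available as a plain key/value map (the real code would read the Flink configuration instead) and that a missing key means the provider is enabled. ProviderSwitch and its methods are hypothetical names used for illustration only.

```java
import java.util.Map;

// Hypothetical helper, not the POC code.
final class ProviderSwitch {

    // Builds the config key for a provider name, for example
    // "security.kerberos.tokens.hbase.enabled".
    static String enabledKey(String providerName) {
        return "security.kerberos.tokens." + providerName + ".enabled";
    }

    // A provider is on unless its key is explicitly set to something
    // other than "true" (assumption: a missing key means enabled).
    static boolean isEnabled(Map<String, String> conf, String providerName) {
        return Boolean.parseBoolean(
                conf.getOrDefault(enabledKey(providerName), "true"));
    }

    public static void main(String[] args) {
        Map<String, String> conf =
                Map.of("security.kerberos.tokens.hbase.enabled", "false");
        System.out.println("hbase enabled: " + isEnabled(conf, "hbase"));
        System.out.println("hadoopfs enabled: " + isEnabled(conf, "hadoopfs"));
    }
}
```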
1. How can a user override an existing provider?
Please see the previous bullet point.
1. What is DelegationTokenProvider#name() used for?
All providers which are registered properly (on classpath + META-INF
entry) are on by default. With "security.kerberos.tokens.${name}.enabled"
a specific provider can be turned off.
Additionally, I intend to use this in log entries later on for debugging
purposes. For example "hadoopfs provider obtained 2 tokens with ID...".
This would help to understand what is happening with tokens and when. The
same applies to the TaskManager side: "2 hadoopfs provider tokens arrived
with ID...". It is important to note that the secret part will be hidden in
the mentioned log entries to keep the attack surface low.
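A rough sketch of such a log entry, assuming a token can be reduced to an ID that is safe to print plus a secret that is not. TokenInfo and TokenLogLine are hypothetical names used for illustration only; the final log format is not decided yet.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not the POC code.
final class TokenLogLine {

    // Hypothetical token holder: a loggable ID plus a secret part.
    static final class TokenInfo {
        final String id;
        final byte[] secret;

        TokenInfo(String id, byte[] secret) {
            this.id = id;
            this.secret = secret;
        }
    }

    // Formats the log entry from the provider name and token IDs only;
    // the secret part is never read here.
    static String obtained(String providerName, List<TokenInfo> tokens) {
        String ids = tokens.stream()
                .map(token -> token.id)
                .collect(Collectors.joining(", "));
        return providerName + " provider obtained " + tokens.size()
                + " tokens with ID " + ids;
    }

    public static void main(String[] args) {
        List<TokenInfo> tokens = List.of(
                new TokenInfo("t1", new byte[] {1, 2}),
                new TokenInfo("t2", new byte[] {3, 4}));
        System.out.println(obtained("hadoopfs", tokens));
    }
}
```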
1. What happens if the names of 2 providers are identical?
I presume you mean 2 different classes which are both registered and have
the same logic inside. In this case both will be loaded and both are going
to obtain token(s) for the same service.
Both obtained token(s) are going to be added to the UGI. As a result the
second will overwrite the first, but the order is not defined. Since both
token(s) are valid, access to the external system will work no matter
which one is used.
When the class names are the same, the service loader only loads a single
entry because services are singletons. That's the reason why keeping state
inside providers is not advised.
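The overwrite behaviour described above can be pictured with a toy model, assuming tokens are stored per service alias so that a later token for the same alias replaces the earlier one. TokenStore is a hypothetical stand-in for illustration and is not the Hadoop Credentials or UGI API.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the credential store: one token per service alias.
final class TokenStore {

    private final Map<String, String> tokensByService = new HashMap<>();

    // A later token for the same service replaces the earlier one.
    void addToken(String service, String token) {
        tokensByService.put(service, token);
    }

    String tokenFor(String service) {
        return tokensByService.get(service);
    }

    int size() {
        return tokensByService.size();
    }

    public static void main(String[] args) {
        TokenStore store = new TokenStore();
        store.addToken("hdfs", "token-from-provider-A");
        store.addToken("hdfs", "token-from-provider-B");
        System.out.println(store.size() + " token kept: " + store.tokenFor("hdfs"));
    }
}
```

Whichever provider runs last wins, and since either token is valid the access still works.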
1. Will we directly load the provider, or first load a factory
(usually preferable)?
The intention is to load providers directly by the DTM. We can add an
extra layer to have a factory, but after consideration I came to the
conclusion that it would be overkill in this case.
Please have a look at how it's planned to load providers now:
https://github.com/apache/flink/compare/master...gaborgsomogyi:dt?expand=1#diff-d56a0bc77335ff23c0318f6dec1872e7b19b1a9ef6d10fff8fbaab9aecac94faR54-R81
1. What is the Credentials class (it would necessarily have to be a
public api as well)?
The Credentials class is coming from Hadoop. My main intention was not to
bind the implementation to Hadoop completely. That is not possible for the
following reasons:
* Several functionalities are a must because there are no alternatives,
including but not limited to login from keytab, proper TGT cache handling,
passing tokens to Hadoop services like HDFS, HBase, Hive, etc.
* The partial win is that the whole delegation token framework is going to
be initiated if hadoop-common is on the classpath (Hadoop is optional in
core libraries)
The possibilities to eliminate Credentials from the API could be:
* to convert Credentials to a byte array and back when a provider gives
back token(s): I think this would be overkill and would make it less clear
from the API what to give back so that the Manager understands it
* to re-implement the internal structure of Credentials in a POJO: here
the same back-and-forth conversion would happen between provider and
manager. I think this would be the reinvent-the-wheel scenario
1. What does the TaskManager do with the received token?
It puts the tokens into the UserGroupInformation instance for the current
user. This way Hadoop-compatible services can pick up the tokens from there
properly.
This is an existing pattern inside Spark.
1. Is there any functionality in the TaskManager that could require a
token on startup (i.e., before registering with the RM)?
I've never seen such functionality in Spark and, after analysis, have not
seen it in Flink either. If you have something in mind which I've missed,
please help me out.
On 11/01/2022 14:58, Gabor Somogyi wrote:
Hi All,
Hope all of you have enjoyed the holiday season.
I would like to start the discussion on
FLIP-211 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-211%3A+Kerberos+delegation+token+framework>
which aims to provide a Kerberos delegation token framework that
obtains/renews/distributes tokens out-of-the-box.
Please be aware that the FLIP wiki area is not fully done since the
discussion may change the feature in major ways. The proposal can be found
in a Google doc here
<https://docs.google.com/document/d/1JzMbQ1pCJsLVz8yHrCxroYMRP2GwGwvacLrGyaIx5Yc/edit?fbclid=IwAR0vfeJvAbEUSzHQAAJfnWTaX46L6o7LyXhMfBUCcPrNi-uXNgoOaI8PMDQ>.
Once the community agrees on the approach, the content will be moved to
the wiki page.
Feel free to add your thoughts to make this feature better!
BR,
G