Thanks for the FLIP Gabor. Generally looks good, have a couple of comments.
If I understand correctly we plan on pulling the current implementation apart and adding an extra layer of abstraction, to allow arbitrary implementations. I notice that most of the delegation token classes [1] are missing compatibility annotations, although a couple are @Experimental. Can we update these classes to add the compatibility annotations ASAP? The DelegationTokenManager class [2] you talk about adding already exists, without any compatibility guarantees. The recent FLIPs [3][4] do not cover what support un-annotated classes default to, unless I missed it. > External connectors need to look into the TokenContainer and when token is found with a specific key then it must be used Will we implement some form of hook that connectors can subscribe to? Otherwise I feel we will duplicate a bunch of logic across connectors periodically polling this TokenContainer for updates. Thanks, [1] https://github.com/apache/flink/tree/0634d0cc0a401d268cc1b3788c8ee63a91ff33a6/flink-runtime/src/main/java/org/apache/flink/runtime/security/token [2] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/security/token/DelegationTokenManager.java [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-196%3A+Source+API+stability+guarantees [4] https://cwiki.apache.org/confluence/display/FLINK/FLIP-197%3A+API+stability+graduation+process On Wed, Nov 9, 2022 at 4:46 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote: > OK, removed the fallback part. > > Thanks for the help! > > G > > > On Wed, Nov 9, 2022 at 5:03 PM Őrhidi Mátyás <matyas.orh...@gmail.com> > wrote: > > > looks better! no further concerns > > > > Cheers, > > Matyas > > > > On Mon, Nov 7, 2022 at 9:21 AM Gabor Somogyi <gabor.g.somo...@gmail.com> > > wrote: > > > > > Oh gosh, copied wrong config keys so fixed my last mail with green. > > > > > > On Mon, Nov 7, 2022 at 6:07 PM Gabor Somogyi < > gabor.g.somo...@gmail.com> > > > wrote: > > > > > > > Hi Matyas, > > > > > > > > In the meantime I was thinking about the per provider re-obtain > feature > > > > and here are my thoughts related that: > > > > * I think it's a good feature in general but as mentioned I would add > > it > > > > in a separate FLIP > > > > * In case of Hadoop providers it just wouldn't work (HBase doesn't > have > > > > end timestamp so actually HDFS is triggering the re-obtain) but in > all > > > > non-hadoop providers it's a good idea > > > > * Adding "security.delegation.tokens.renewal.retry.backoff" and > > > > "security.delegation.tokens.renewal.time-ratio" is needed but as you > > > > mentioned fallback to kerberos configs just doesn't make sense > > > > * In a later FLIP we can add per provider > > > > "security.delegation.token.provider.{providerName}.renewal.retry.backoff" > > > > and/or > > > "security.delegation.token.provider.{providerName}.renewal.time-ratio" > > > > for non-hadoop providers > > > > * This is an additional feature which justifies to separate Hadoop > and > > > > non-hadoop providers on the API level > > > > > > > > Waiting on your opinion. > > > > > > > > G > > > > > > > > > > > > On Mon, Nov 7, 2022 at 4:17 PM Gabor Somogyi < > > gabor.g.somo...@gmail.com> > > > > wrote: > > > > > > > >> Hi Matyas, > > > >> > > > >> Thanks for your comments, answered inline. > > > >> > > > >> G > > > >> > > > >> > > > >> On Mon, Nov 7, 2022 at 2:58 PM Őrhidi Mátyás < > matyas.orh...@gmail.com > > > > > > >> wrote: > > > >> > > > >>> Hi Gabor, > > > >>> > > > >>> Thanks for driving this effort! A few thoughts on the topic: > > > >>> - Could you please add a few examples of the delegation token > > providers > > > >>> we > > > >>> expected to be added in the near future? Ideally these providers > > could > > > be > > > >>> configured independently from each other. However the > configuration > > > >>> defaults mentioned in the FLIP are derived from hadoop > > configuration. I > > > >>> don't see the point here. > > > >>> > > > >> A clear plan is to add S3 now and Kafka possibly later on. > > > >> > > > >> S3 looks straightforward but that doesn't fit into the existing > > > framework. > > > >> On Kafka side I've added the Kafka provider to Spark so I can > imagine > > > >> similar solution w/ minor differences. > > > >> Please see the Spark solution: > > > >> > > > > > > https://spark.apache.org/docs/latest/structured-streaming-kafka-integration.html#security > > > >> Here the minor planned difference is that Spark handles Kafka token > > > >> inside UGI which is not nice but works. > > > >> I've just seen similar generalization effort on the Spark side too > so > > > the > > > >> Kafka part may or may not change there. > > > >> > > > >> Not sure what you mean configs are derived from Hadoop. > > > >> > > > >> What I can think of is that > > > >> "security.delegation.tokens.renewal.retry.backoff" is defaulting to > > > >> "security.kerberos.tokens.renewal.retry.backoff". > > > >> This is basically happening for backward compatibility purposes. > > > >> > > > >> The other thing what I can think of is that you miss the independent > > > >> provider token obtain functionality. > > > >> When I mean independent configuration I mean each provider has it's > > own > > > >> set of keys which doesn't collide > > > >> but the voting system when token obtain must happen remains and not > > > >> touched in this FLIP. > > > >> Under voting system I mean each provider may send back its end > > timestamp > > > >> and the lowest wins > > > >> (at that timestamp all tokens are going to be re-obtained). > > > >> If that's what you mean we can think about solutions but it has > > nothing > > > >> to do with framework generalization. > > > >> > > > >> > > > >>> - Are we planning to support such scenarios where we need to > > read/write > > > >>> from different authentication realms from the same application. Two > > > >>> Hadoop > > > >>> clusters, Kafka clusters etc? This would need an authentication > > > provider > > > >>> per source/sink. > > > >>> > > > >>> > > > >> It doesn't need 2 different provider per source/sink to do Kafka to > > > >> Kafka. Such cases cross-realm trust can be configured w/ principal > > > mapping. > > > >> Please see the details here: > > > >> > > https://gist.github.com/gaborgsomogyi/c636f352ccec7730ff41ac1d524cb87d > > > >> Even if the gist was originally created for Spark I tend to do the > > same > > > >> here in the future. > > > >> > > > >> Just to make an extract here Kafka principal mapping looks like > this: > > > >> sasl.kerberos.principal.to.local.rules=RULE:[1:$1@$0](.*@ > > > >> DT2HOST.COMPANY.COM)s/@.*//,DEFAULT > > > >> > > > >> Providing 2 users in Hadoop world is just impossible because UGI is > > > >> basically a singleton. > > > >> Of course everything can be hacked around (for ex. changing current > > user > > > >> on-the-fly) but that would be such a > > > >> headache what I think we must avoid. It would end-up in > > synchronization > > > >> and performance hell. > > > >> I've made some experiments in my Spark era and this would be the > same > > > >> here :) > > > >> > > > >> > > > >>> Thanks, > > > >>> Matyas > > > >>> > > > >>> > > > >>> > > > >>> On Mon, Nov 7, 2022 at 5:10 AM Gabor Somogyi < > > > gabor.g.somo...@gmail.com> > > > >>> wrote: > > > >>> > > > >>> > Hi team, > > > >>> > > > > >>> > Delegation token framework is going to be finished soon (added in > > > >>> FLIP-211 > > > >>> > < > > > >>> > > > > >>> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-211%3A+Kerberos+delegation+token+framework?src=contextnavpagetreemode > > > >>> > > > > > >>> > ). > > > >>> > Previously there were concerns that the current implementation is > > > >>> bound to > > > >>> > Hadoop and Kerberos authentication. This is fair concern and as a > > > >>> result > > > >>> > we've created a proposal to generalize the delegation token > > framework > > > >>> > (practically making it authentication agnostic). > > > >>> > > > > >>> > This can open the path to add further non-hadoop and non-Kerberos > > > based > > > >>> > providers like S3 or many others. > > > >>> > > > > >>> > One can find the FLIP in: > > > >>> > - Wiki: > > > >>> > > > > >>> > > > > >>> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-272%3A+Generalized+delegation+token+support > > > >>> > - document: > > > >>> > > > > >>> > > > > >>> > > > > > > https://docs.google.com/document/d/12tFdx1AZVuW9BjwBht_pMNELgrqro8Z5-hzWeaRY4pc/edit?usp=sharing > > > >>> > > > > >>> > I would like to start a discussion to make the framework better. > > > >>> > > > > >>> > BR, > > > >>> > G > > > >>> > > > > >>> > > > >> > > > > > >