Hi Niels,

Thank you for your question. Flink relies entirely on the Kerberos support of Hadoop, so your question could also be rephrased as "Does Hadoop support long-term authentication using Kerberos?" And the answer is: Yes!
While Hadoop uses Kerberos tickets to authenticate users with services initially, the authentication process continues differently afterwards. Instead of saving the ticket to authenticate on a later access, Hadoop creates its own security tokens (DelegationToken) that it passes around. These are authenticated against Kerberos periodically. To my knowledge, the tokens have a life span identical to the Kerberos ticket maximum life span, so be sure to set the maximum life span very high for long-running streaming jobs. The renewal time, on the other hand, is not important, because Hadoop abstracts it away using its own security tokens.

I'm afraid there is no Kerberos how-to yet. If you are on YARN, then it is sufficient to authenticate the client with Kerberos. On a standalone Flink cluster you need to ensure that, initially, all nodes are authenticated with Kerberos using the kinit tool.

Feel free to ask if you have more questions, and let us know about any difficulties.

Best regards,
Max

On Thu, Oct 22, 2015 at 2:06 PM, Niels Basjes <ni...@basjes.nl> wrote:
> Hi,
>
> I want to write a long-running (i.e. never stop it) streaming Flink
> application on a Kerberos-secured Hadoop/YARN cluster. My application needs
> to do things with files on HDFS and HBase tables on that cluster, so having
> the correct Kerberos tickets is very important. The stream is to be ingested
> from Kafka.
>
> One of the things with Kerberos is that the tickets expire after a
> predetermined time. My knowledge about Kerberos is very limited, so I hope
> you guys can help me.
>
> My question is actually quite simple: Is there a how-to somewhere on how to
> correctly run a long-running Flink application with Kerberos that includes a
> solution for the Kerberos ticket timeout?
>
> Thanks
>
> Niels Basjes
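
P.S. To make the "set the maximum life span very high" part concrete: the client-side lifetimes requested by kinit are configured in krb5.conf. A minimal sketch (the realm name and the specific values are just placeholders, and the KDC's own limits, e.g. max_life in kdc.conf, must also permit these lifetimes):

```
# /etc/krb5.conf -- client-side request only; the KDC must also
# be configured to allow lifetimes this long (assumed here).
[libdefaults]
    default_realm = EXAMPLE.COM   # hypothetical realm
    ticket_lifetime = 7d          # requested maximum ticket life span
    renew_lifetime = 30d          # requested maximum renewable life
```

On a standalone cluster you would then authenticate each node before starting it, e.g. with something like `kinit -kt /path/to/flink.keytab flink/host@EXAMPLE.COM` (keytab path and principal are again placeholders), and verify the resulting expiry time with `klist`.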