Github user EronWright commented on the issue:

    https://github.com/apache/flink/pull/3776
  
    @Rucongzhang thanks for the contribution.  I think I understand the problem 
and your solution, which I will recap.  I also found YARN-2704 to be useful 
background.
    
    *Problem*:
    1. YARN log aggregation depends on an HDFS delegation token, which it 
obtains from container token storage, not from the UGI.  In keytab mode, the 
Flink client doesn't upload any delegation tokens, causing log aggregation to 
fail.
    2. The Flink cluster doesn't renew delegation tokens.  Note: Flink does 
renew _Kerberos tickets_ using the keytab.
    3. When the UGI contains both a delegation token and a Kerberos ticket, the 
delegation token is preferred.  After the token expires, Flink does not fall 
back to using the ticket.
    
    *Solution*:
    1. Change Flink client to upload delegation tokens.  Addresses problem 1.
    2. Change Flink cluster to filter out the HDFS delegation token from the 
tokens loaded from storage when populating the UGI.  Addresses problem 3.
    3. Change JM to propagate its stored tokens to the TM, rather than the 
tokens from the UGI (which were filtered in (2)).
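
    To check my understanding of steps 2 and 3, here is a rough sketch of the 
filtering idea.  It is not the code from this PR: `StoredToken`, 
`tokensForUgi` and `tokensForTaskManager` are hypothetical names, and the 
token is modeled as a plain class rather than Hadoop's `Token` so that the 
snippet stands alone.  The only Hadoop-specific detail assumed is the token 
kind string `HDFS_DELEGATION_TOKEN`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TokenFilterSketch {
    // Kind string that Hadoop assigns to HDFS delegation tokens.
    static final String HDFS_DELEGATION_KIND = "HDFS_DELEGATION_TOKEN";

    /** Hypothetical stand-in for a Hadoop token; only kind and service matter here. */
    static final class StoredToken {
        final String kind;
        final String service;

        StoredToken(String kind, String service) {
            this.kind = kind;
            this.service = service;
        }
    }

    /** Step 2: tokens to add to the UGI, leaving out the HDFS delegation token. */
    static List<StoredToken> tokensForUgi(List<StoredToken> stored) {
        List<StoredToken> result = new ArrayList<>();
        for (StoredToken token : stored) {
            if (!HDFS_DELEGATION_KIND.equals(token.kind)) {
                result.add(token);
            }
        }
        return result;
    }

    /** Step 3: the JM forwards the full stored set to the TM, not the filtered UGI view. */
    static List<StoredToken> tokensForTaskManager(List<StoredToken> stored) {
        return new ArrayList<>(stored);
    }

    public static void main(String[] args) {
        // Illustrative token set; the service names are made up.
        List<StoredToken> stored = Arrays.asList(
                new StoredToken(HDFS_DELEGATION_KIND, "namenode"),
                new StoredToken("YARN_AM_RM_TOKEN", "resourcemanager"));

        System.out.println("UGI tokens: " + tokensForUgi(stored).size());
        System.out.println("TM tokens:  " + tokensForTaskManager(stored).size());
    }
}
```

    With the HDFS delegation token kept out of the UGI, HDFS access in the 
cluster would rely on the Kerberos ticket (which Flink renews from the 
keytab), while the containers still receive the delegation token that log 
aggregation needs.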

