Hi, I am using flink single-job mode on YARN to read data from a kafka cluster installation configured for Kerberos. When i upgrade flink to 1.4.0 , the yarn application can not run normally and logs th error like this:
Exception in thread "main" java.lang.RuntimeException: org.apache.flink.configuration.IllegalConfigurationException: Kerberos login configuration is invalid; keytab is unreadable at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:160) at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65) at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala) Caused by: org.apache.flink.configuration.IllegalConfigurationException: Kerberos login configuration is invalid; keytab is unreadable at org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:139) at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90) at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71) at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139 So i add some logs for the method "SecurityConfiguration.validate()" and rebuild the flink package. private void validate() { if (!StringUtils.isBlank(keytab)) { // principal is required if (StringUtils.isBlank(principal)) { throw new IllegalConfigurationException("Kerberos login configuration is invalid; keytab requires a principal."); } // check the keytab is readable File keytabFile = new File(keytab); if (!keytabFile.exists()) { throw new IllegalConfigurationException("WTF! keytabFile is not exist ! keytab:" + keytab); } if (!keytabFile.isFile()) { throw new IllegalConfigurationException("WTF! keytabFile is not file ! keytab:" + keytab); } if (!keytabFile.canRead()) { throw new IllegalConfigurationException("WTF! keytabFile is not readalbe ! keytab:" + keytab); } if (!keytabFile.exists() || !keytabFile.isFile() || !keytabFile.canRead()) { throw new IllegalConfigurationException("Kerberos login configuration is invalid; keytab is unreadable"); } } } After that , the yarn logs error like this : 017-12-15 17:14:36,314 INFO org.apache.flink.yarn.YarnTaskManagerRunner - localKeytabPath: /data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000002/krb5.keytab 2017-12-15 17:14:36,315 INFO org.apache.flink.yarn.YarnTaskManagerRunner - YARN daemon is running as: hadoop Yarn client user obtainer: hadoop 2017-12-15 17:14:36,315 INFO org.apache.flink.yarn.YarnTaskManagerRunner - ResourceID assigned for this container: container_e05_1513310528578_0009_01_000002 2017-12-15 17:14:36,321 ERROR org.apache.flink.yarn.YarnTaskManagerRunner - Exception occurred while launching Task Manager org.apache.flink.configuration.IllegalConfigurationException: WTF! keytabFile is not exist ! keytab:/data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000001/krb5.keytab at org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:140) at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90) at org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71) at org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139) at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65) at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala) These logs tell the "keytabFile" value is different from the "localKeytabPath". I searched the "org.apache.flink.yarn.YarnTaskManagerRunner" class source code and found there are something different betwee 1.3.2 and 1.4.0 1.3.2 //To support Yarn Secure Integration Test Scenario File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME); if (krb5Conf.exists() && krb5Conf.canRead()) { String krb5Path = krb5Conf.getAbsolutePath(); LOG.info("KRB5 Conf: {}", krb5Path); hadoopConfiguration = new org.apache.hadoop.conf.Configuration(); hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos"); hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true"); } // set keytab principal and replace path with the local path of the shipped keytab file in NodeManager if (localKeytabPath != null && remoteKeytabPrincipal != null) { configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB, localKeytabPath); configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL, remoteKeytabPrincipal); } 1.4.0 //To support Yarn Secure Integration Test Scenario File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME); if (krb5Conf.exists() && krb5Conf.canRead()) { String krb5Path = krb5Conf.getAbsolutePath(); LOG.info("KRB5 Conf: {}", krb5Path); org.apache.hadoop.conf.Configuration hadoopConfiguration = new org.apache.hadoop.conf.Configuration(); hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION, "kerberos"); hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION, "true"); // set keytab principal and replace path with the local path of the shipped keytab file in NodeManager if (localKeytabPath != null && remoteKeytabPrincipal != null) { configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB, localKeytabPath); configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL, remoteKeytabPrincipal); } sc = new SecurityConfiguration(configuration, Collections.singletonList(securityConfig -> new HadoopModule(securityConfig, hadoopConfiguration))); } else { sc = new SecurityConfiguration(configuration); } In the previous version ,the "SecurityOptions.KERBEROS_LOGIN_KEYTAB" is always set the same with "localKeytabPath" but in 1.4.0 only if the "krb5Conf.exists() && krb5Conf.canRead()" retrun true . And in my test case ,it looks like the code only run the else default code。 Are there something i counld do to work around this problem ? Thanks!