Hi,
I am using flink single-job mode on YARN to read data from a kafka
cluster installation configured for Kerberos. When i upgrade flink to
1.4.0 , the yarn application can not run normally and logs th error
like this:
Exception in thread "main" java.lang.RuntimeException:
org.apache.flink.configuration.IllegalConfigurationException: Kerberos
login configuration is invalid; keytab is unreadable
at
org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:160)
at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
Caused by: org.apache.flink.configuration.IllegalConfigurationException:
Kerberos login configuration is invalid; keytab is unreadable
at
org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:139)
at
org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
at
org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
at
org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139
So i add some logs for the method "SecurityConfiguration.validate()"
and rebuild the flink package.
private void validate() {
if (!StringUtils.isBlank(keytab)) {
// principal is required
if (StringUtils.isBlank(principal)) {
throw new IllegalConfigurationException("Kerberos login
configuration is invalid; keytab requires a principal.");
}
// check the keytab is readable
File keytabFile = new File(keytab);
if (!keytabFile.exists()) {
throw new IllegalConfigurationException("WTF! keytabFile is
not exist ! keytab:" + keytab);
}
if (!keytabFile.isFile()) {
throw new IllegalConfigurationException("WTF! keytabFile is
not file ! keytab:" + keytab);
}
if (!keytabFile.canRead()) {
throw new IllegalConfigurationException("WTF! keytabFile is
not readalbe ! keytab:" + keytab);
}
if (!keytabFile.exists() || !keytabFile.isFile() ||
!keytabFile.canRead()) {
throw new IllegalConfigurationException("Kerberos login
configuration is invalid; keytab is unreadable");
}
}
}
After that , the yarn logs error like this :
017-12-15 17:14:36,314 INFO
org.apache.flink.yarn.YarnTaskManagerRunner -
localKeytabPath:
/data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000002/krb5.keytab
2017-12-15 17:14:36,315 INFO
org.apache.flink.yarn.YarnTaskManagerRunner - YARN
daemon is running as: hadoop Yarn client user obtainer: hadoop
2017-12-15 17:14:36,315 INFO
org.apache.flink.yarn.YarnTaskManagerRunner -
ResourceID assigned for this container:
container_e05_1513310528578_0009_01_000002
2017-12-15 17:14:36,321 ERROR
org.apache.flink.yarn.YarnTaskManagerRunner -
Exception occurred while launching Task Manager
org.apache.flink.configuration.IllegalConfigurationException: WTF!
keytabFile is not exist !
keytab:/data1/yarn/nm/usercache/hadoop/appcache/application_1513310528578_0009/container_e05_1513310528578_0009_01_000001/krb5.keytab
at
org.apache.flink.runtime.security.SecurityConfiguration.validate(SecurityConfiguration.java:140)
at
org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:90)
at
org.apache.flink.runtime.security.SecurityConfiguration.<init>(SecurityConfiguration.java:71)
at
org.apache.flink.yarn.YarnTaskManagerRunner.runYarnTaskManager(YarnTaskManagerRunner.java:139)
at org.apache.flink.yarn.YarnTaskManager$.main(YarnTaskManager.scala:65)
at org.apache.flink.yarn.YarnTaskManager.main(YarnTaskManager.scala)
These logs tell the "keytabFile" value is different from the
"localKeytabPath". I searched the
"org.apache.flink.yarn.YarnTaskManagerRunner" class source code and
found there are
something different betwee 1.3.2 and 1.4.0
1.3.2
//To support Yarn Secure Integration Test Scenario
File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
if (krb5Conf.exists() && krb5Conf.canRead()) {
String krb5Path = krb5Conf.getAbsolutePath();
LOG.info("KRB5 Conf: {}", krb5Path);
hadoopConfiguration = new org.apache.hadoop.conf.Configuration();
hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION,
"kerberos");
hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
"true");
}
// set keytab principal and replace path with the local path of the
shipped keytab file in NodeManager
if (localKeytabPath != null && remoteKeytabPrincipal != null) {
configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB,
localKeytabPath);
configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL,
remoteKeytabPrincipal);
}
1.4.0
//To support Yarn Secure Integration Test Scenario
File krb5Conf = new File(currDir, Utils.KRB5_FILE_NAME);
if (krb5Conf.exists() && krb5Conf.canRead()) {
String krb5Path = krb5Conf.getAbsolutePath();
LOG.info("KRB5 Conf: {}", krb5Path);
org.apache.hadoop.conf.Configuration hadoopConfiguration = new
org.apache.hadoop.conf.Configuration();
hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHENTICATION,
"kerberos");
hadoopConfiguration.set(CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION,
"true");
// set keytab principal and replace path with the local path of the
shipped keytab file in NodeManager
if (localKeytabPath != null && remoteKeytabPrincipal != null) {
configuration.setString(SecurityOptions.KERBEROS_LOGIN_KEYTAB,
localKeytabPath);
configuration.setString(SecurityOptions.KERBEROS_LOGIN_PRINCIPAL,
remoteKeytabPrincipal);
}
sc = new SecurityConfiguration(configuration,
Collections.singletonList(securityConfig -> new
HadoopModule(securityConfig, hadoopConfiguration)));
} else {
sc = new SecurityConfiguration(configuration);
}
In the previous version ,the "SecurityOptions.KERBEROS_LOGIN_KEYTAB"
is always set the same with "localKeytabPath" but in 1.4.0 only if the
"krb5Conf.exists() && krb5Conf.canRead()" retrun true . And in my test
case ,it looks like the code only run the else default code。
Are there something i counld do to work around this problem ?
Thanks!