Hello everybody,

My colleagues and I have been running some tests on Flink 1.0.0 in a secure (Kerberos) environment. Yesterday we ran several tests against a standalone Flink deployment but couldn't get it to access HDFS. Judging from the error, Flink does not appear to be authenticating itself with Kerberos at all. The root cause of the error is "org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]". I've put the full logs in this gist: <https://gist.github.com/stefanobaghino/6f3a877f11bd4a853a84>.

I've gone through the source code and, from what I saw, Hadoop emits this error when a client attempts no authentication method at all against a secure cluster. Also, judging from Flink's source code, when a job runs on a secure cluster a log message (at INFO level) should be printed stating as much.
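To see what the Hadoop client actually resolves on a given classpath, this is the kind of minimal standalone check I have in mind (plain Hadoop client code, not Flink's; the class name is just for illustration):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.security.UserGroupInformation;

    public class AuthCheck {
        public static void main(String[] args) throws Exception {
            // Loads core-default.xml and core-site.xml from the classpath
            Configuration conf = new Configuration();
            System.out.println("hadoop.security.authentication = "
                    + conf.get("hadoop.security.authentication", "simple"));

            UserGroupInformation.setConfiguration(conf);
            System.out.println("security enabled: "
                    + UserGroupInformation.isSecurityEnabled());

            // Shows which user and auth method the client would present
            UserGroupInformation ugi = UserGroupInformation.getCurrentUser();
            System.out.println("current user: " + ugi);
            System.out.println("auth method:  " + ugi.getAuthenticationMethod());
            System.out.println("has Kerberos credentials: "
                    + ugi.hasKerberosCredentials());
        }
    }

If "hadoop.security.authentication" comes back as "simple" here, I'd conclude the cluster's core-site.xml is not on the classpath the process is using.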
To go through the steps I followed to set up the environment: I built Flink and put it in the same directory on both nodes of the cluster, adjusted the configuration, and assigned its ownership (and write permissions) to a group; then I ran kinit with a user belonging to that group on both nodes, and finally I ran start-cluster.sh and deployed the job. I tried running the job both as the same user who ran the start-cluster.sh script and as a different one (also authenticated with Kerberos on both nodes).

The core-site.xml correctly states that the authentication method is kerberos, and everything works as expected with the hdfs CLI. Thinking the problem could be tied to permissions on the core-site.xml file, I also added the user running the start-cluster.sh script to the hadoop group, which owns the file, but that unfortunately yielded the same results.

Can you help me troubleshoot this issue? Thank you so much in advance!

--
BR,
Stefano Baghino
Software Engineer @ Radicalbit
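P.S. To double-check whether the same classpath and ticket cache can reach HDFS outside of Flink, here is a sketch of the probe I'd try (the namenode URI is a placeholder for our actual one):

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.security.UserGroupInformation;

    public class HdfsProbe {
        public static void main(String[] args) throws Exception {
            // Picks up core-site.xml/hdfs-site.xml from the classpath
            Configuration conf = new Configuration();
            UserGroupInformation.setConfiguration(conf);

            // Placeholder URI: adjust to the actual namenode
            FileSystem fs = FileSystem.get(new URI("hdfs://namenode:8020/"), conf);
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }

If this fails with the same AccessControlException, I'd suspect the environment the JVM sees rather than Flink itself.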