Hi Stefano,

You have probably seen https://ci.apache.org/projects/flink/flink-docs-release-1.0/setup/config.html#kerberos ?
Currently, all nodes need to be authenticated with Kerberos before Flink is started (not just the JobManager). Could it be that the start-cluster.sh script is actually not authenticated with Kerberos on the nodes it sshs to when it starts the TaskManagers?

Best,
Max

On Fri, Mar 11, 2016 at 8:17 AM, Stefano Baghino <stefano.bagh...@radicalbit.io> wrote:
> Hello everybody,
>
> my colleagues and I have been running some tests on Flink 1.0.0 in a secure
> environment (Kerberos). Yesterday we ran several tests on the standalone
> Flink deployment but couldn't get it to access HDFS. Judging from the error,
> it looks like Flink is not trying to authenticate itself with Kerberos. The
> root cause of the error is
> "org.apache.hadoop.security.AccessControlException: SIMPLE authentication is
> not enabled. Available:[TOKEN, KERBEROS]". I've put the full logs in this
> gist. I've gone through the source code and, judging from what I saw, this
> error is emitted by Hadoop if a client is not using any authentication
> method on a secure cluster. Also, in the source code of Flink, it looks like
> when running a job on a secure cluster a log message (at INFO level) should
> be printed stating that fact.
>
> These are the steps I followed to set up the environment: I built
> Flink and put it in the same folder on the two nodes of the cluster,
> adjusted the configs, assigned its ownership (and write permissions) to a
> group, then ran kinit with a user belonging to that group on both
> nodes, and finally ran start-cluster.sh and deployed the job. I tried
> running the job both as the same user who ran the start-cluster.sh script
> and as another one (still authenticated with Kerberos on both nodes).
>
> The core-site.xml correctly states that the authentication method is
> kerberos, and using the hdfs CLI everything runs as expected. Thinking it
> could be an error tied to permissions on the core-site.xml file, I also added
> the user running the start-cluster.sh script to the hadoop group, which
> owns the file, but unfortunately this yielded the same results.
>
> Can you help me troubleshoot this issue? Thank you so much in advance!
>
> --
> BR,
> Stefano Baghino
>
> Software Engineer @ Radicalbit
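
One way to check the point raised in the reply, that every node (not just the JobManager) needs a valid Kerberos ticket before Flink is started, is to ask each node over ssh, the same way start-cluster.sh reaches them. The sketch below is only an illustration under some assumptions that are not stated in the thread: the TaskManager hosts are listed one per line in a file such as conf/slaves, passwordless ssh is set up, and `klist -s` (the silent, exit-status-only mode of MIT Kerberos klist) is available on every node. The function name `check_kerberos_tickets` and the `REMOTE_SHELL` variable are hypothetical names introduced for this example.

```shell
#!/usr/bin/env bash
# Sketch: report which hosts have no valid Kerberos ticket in the
# credential cache seen by a non-interactive ssh session.

# REMOTE_SHELL is a hypothetical hook so the loop can be dry-run with a stub.
REMOTE_SHELL="${REMOTE_SHELL:-ssh}"

check_kerberos_tickets() {
  local hosts_file="$1" host missing=0
  while IFS= read -r host || [ -n "$host" ]; do
    # Skip blank lines and comment lines.
    case "$host" in ''|'#'*) continue ;; esac
    # </dev/null keeps ssh from consuming the rest of the hosts file.
    if $REMOTE_SHELL "$host" 'klist -s' </dev/null; then
      echo "$host: valid ticket"
    else
      echo "$host: NO valid ticket"
      missing=$((missing + 1))
    fi
  done < "$hosts_file"
  # Succeed only if every host had a valid ticket.
  [ "$missing" -eq 0 ]
}
```

Called as `check_kerberos_tickets conf/slaves` from the Flink directory on the machine where start-cluster.sh is run, it prints one line per host and returns a non-zero status if any host lacks a ticket. A host that reports no ticket here, although kinit was run there in an interactive login, would suggest that the ssh session does not see the same credential cache.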