Hi Daniel,

I think at least regarding 3 there is a quick fix in the pom.xml - we need
to exclude the hadoop-* artifacts from the shade plugin. I think Robert can
confirm whether this is the case.
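Something along these lines in the relevant pom.xml should do it (untested
sketch off the top of my head; the module and the exact exclusion pattern
may need tweaking):

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <artifactSet>
      <excludes>
        <!-- keep the Hadoop/YARN dependencies out of the uberjar;
             they are on the cluster classpath anyway -->
        <exclude>org.apache.hadoop:*</exclude>
      </excludes>
    </artifactSet>
  </configuration>
</plugin>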
Regards,
Alexander

2015-01-18 18:28 GMT+01:00 Daniel Warneke <warn...@apache.org>:

> Hi,
>
> I just pushed my first version of Flink supporting YARN environments with
> security/Kerberos enabled [1]. While working with the current Flink
> version, I was really impressed by how easy it is to deploy the software
> on a YARN cluster. However, there are a few things I stumbled upon and I
> would be interested in your opinion:
>
> 1. Separation between YARN session and Flink job
> Currently, we separate the Flink YARN session from the Flink jobs, i.e. a
> user first has to bring up the Flink cluster on YARN through a separate
> command and can then submit an arbitrary number of jobs to this cluster.
> This separation makes it possible to submit individual jobs with very low
> latency, but it introduces two major problems: First, it is currently
> impossible to programmatically launch a Flink YARN cluster, submit a job,
> wait for its completion and then tear the cluster down again (correct me
> if I’m wrong here), although this is actually a very important use case.
> Second, with security enabled, all jobs are executed with the security
> credentials of the user who launched the Flink cluster. This causes
> massive authorization problems. Therefore, I would propose to move to a
> model where we launch one Flink cluster per job (or at least to make this
> a very prominent option).
>
> 2. Loading Hadoop configuration settings for Flink
> In the current release, we use custom code to identify and load the
> relevant Hadoop XML configuration files (e.g. core-site.xml,
> yarn-site.xml) for the Flink YARN client. I found this mechanism to be
> quite fragile, as it depends on certain environment variables being set
> and assumes certain configuration keys to be specified in certain files.
> For example, with Hadoop security enabled, the Flink YARN client needs to
> know what kind of authentication mechanism HDFS expects for the data
> transfer. This setting is usually specified in hdfs-site.xml. In the
> current Flink version, the YARN client ignores this file and hence cannot
> talk to HDFS when security is enabled.
> As an alternative, I propose to launch the Flink cluster on YARN through
> the “yarn jar” command. With this command, you get the entire
> configuration setup for free and no longer have to worry about names of
> configuration files, configuration paths and environment variables.
>
> 3. The uberjar deployment model
> In my opinion, the current Flink deployment model for YARN, with the one
> fat uberjar, is unnecessarily bulky. With the last release, the Flink
> uberjar has grown to over 100 MB in size, amounting to almost 400 MB of
> class files when uncompressed. Many of the includes are not even
> necessary. For example, when using the “yarn jar” hook to deploy Flink,
> all relevant Hadoop libraries are added to the classpath anyway, so there
> is no need to include them in the uberjar (unless you assume the client
> does not have a Hadoop environment installed). Personally, I would favor
> a more fine-granular deployment model. Especially when we move to a
> one-job-per-session model, I think we should allow having Flink
> preinstalled on the cluster nodes and not always require redistributing
> the 100 MB uberjar to each and every node.
>
> Any thoughts on that?
>
> Best regards,
>
> Daniel
>
> [1] https://github.com/warneke/flink/tree/security
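For context on point 2: the “yarn jar” hook Daniel refers to is Hadoop's
generic application launcher, which picks up the full client-side Hadoop
configuration (core-site.xml, hdfs-site.xml, yarn-site.xml, ...)
automatically. Its general form is

  yarn jar <app.jar> [mainClass] [args...]

so a Flink launch through it might look roughly like the following, where
the jar name, class name and argument are purely illustrative, not Flink's
actual ones:

  yarn jar flink-yarn-client.jar org.apache.flink.yarn.Client myJob.jar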