Hi Leon,
You are welcome. ‘Each plugin is loaded through its own classloader’ (see the 
doc<https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/filesystems/plugins/>),
 so plugins are not added to the Flink system classpath. If I understand 
correctly, you do not need to do any extra work as long as you configure them 
correctly in flink-conf.yaml.
If a specific Flink job needs some dependency jars, since 1.15.0 you can put 
those jars under ‘usrlib’ (if the directory does not exist, you can create it 
yourself), and they will be shipped automatically as well.

Best,
Biao Geng

From: Leon Xu <l...@attentivemobile.com>
Date: Sunday, June 5, 2022 at 4:04 PM
To: Biao Geng <biaoge...@gmail.com>
Cc: user <user@flink.apache.org>
Subject: Re: Questions regarding classpath loading order in 
YarnClusterDescriptor

Hi Biao,

I really appreciate your thorough answers. And yes, for now I am using the 
workaround of manipulating the directory names.
One follow-up question, if you don't mind:
What is the recommended way of managing plugins in YarnClusterDescriptor? 
Currently I am placing the plugins (e.g. flink-s3-fs-hadoop) under the system 
jars setting, which works. But I am also seeing this comment in the 
code<https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L908>
 so I am a bit confused.


Thanks
Leon

On Sat, Jun 4, 2022 at 11:03 PM Biao Geng 
<biaoge...@gmail.com<mailto:biaoge...@gmail.com>> wrote:
Hi Leon,

For your question 1: the classpath contains 2 types of jars, user jars and 
Flink system jars (i.e. jars in flink/lib). System jars are sorted 
alphabetically. For user jars, there are 3 options for where they are added in 
the final classpath: ORDER, FIRST, LAST (see the 
doc<https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/resource-providers/yarn/#user-jars--classpath>
 for more details). To the best of my knowledge, there is currently no way to 
pass a custom sort function. One workaround is to manage your jar paths: you 
can put the jar that you want loaded first in a directory whose name sorts 
earlier alphabetically (e.g. a-flink/user-jar).
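
To illustrate the directory-name workaround, here is a minimal sketch in plain 
Java. It only models "alphabetical order" as an ordinary lexicographic sort of 
path strings, which is an assumption and not Flink's actual code. The class 
name, the helper sortAlphabetically, and all jar paths are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class JarOrderSketch {

    // Hypothetical helper: returns a lexicographically sorted copy of the paths,
    // standing in for the alphabetical ordering described above.
    static List<String> sortAlphabetically(List<String> paths) {
        List<String> sorted = new ArrayList<>(paths);
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up paths: the jar we want first lives under a directory
        // whose name sorts before the others.
        List<String> paths = List.of(
                "lib/some-connector.jar",
                "b-deps/other-dep.jar",
                "a-flink/user-jar/preferred-dep.jar");

        for (String path : sortAlphabetically(paths)) {
            System.out.println(path);
        }
    }
}
```

Because the sort compares the full path, renaming only the parent directory is 
enough to move a jar earlier; the jar file itself does not need to be renamed.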
For your question 2: flink-dist.jar is always placed at the end of the system 
jars. Depending on how you choose to add user jars, it is not always at the end 
of the final generated classpath. flink-dist.jar is special and mandatory 
because we need it to launch the Java process that runs ClusterEntrypoint on 
the cluster side. The other jars in flink/lib are less critical by comparison.
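
As a rough model of the ordering described above, the sketch below assembles a 
classpath from system jars, user jars, and the dist jar. It is a simplified 
illustration and not the YarnClusterDescriptor implementation. In particular, 
treating ORDER as "merge user jars into the system jars and sort them together" 
is my assumption, and the class, enum, and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ClasspathAssemblySketch {

    // Mirrors the three options named above; the enum itself is hypothetical.
    enum UserJarInclusion { ORDER, FIRST, LAST }

    static List<String> assemble(
            List<String> systemJars,
            List<String> userJars,
            String distJar,
            UserJarInclusion mode) {

        List<String> system = new ArrayList<>(systemJars);
        List<String> user = new ArrayList<>(userJars);

        // Assumption: ORDER sorts user jars together with the system jars.
        if (mode == UserJarInclusion.ORDER) {
            system.addAll(user);
            user.clear();
        }
        Collections.sort(system);
        Collections.sort(user);

        List<String> classpath = new ArrayList<>();
        if (mode == UserJarInclusion.FIRST) {
            classpath.addAll(user);
        }
        classpath.addAll(system);
        // The dist jar always follows the system jars.
        classpath.add(distJar);
        if (mode == UserJarInclusion.LAST) {
            classpath.addAll(user);
        }
        return classpath;
    }

    public static void main(String[] args) {
        List<String> systemJars = List.of("lib/b.jar", "lib/a.jar");
        List<String> userJars = List.of("usr/job.jar");
        for (UserJarInclusion mode : UserJarInclusion.values()) {
            System.out.println(mode + " -> "
                    + assemble(systemJars, userJars, "flink-dist.jar", mode));
        }
    }
}
```

In this model the dist jar is the last entry for ORDER and FIRST, while with 
LAST the user jars come after it, which matches the point that flink-dist.jar 
is not always at the end of the final classpath.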

I have run into a similar problem as well. My previous workaround was managing 
the directory names, which is not so elegant. It could be useful to add the 
ability to customize the loading order of jars in the classpath, though it is 
also important to package the jars more carefully to avoid conflicts.

Best,
Biao Geng


Leon Xu <l...@attentivemobile.com<mailto:l...@attentivemobile.com>> 
wrote on Sunday, June 5, 2022 at 03:21:
Hi Flink Community,

We are building on top of org.apache.flink.yarn.YarnClusterDescriptor to 
submit a Flink application from Java code to a YARN cluster, in application 
mode. We are setting the classpath as the value of the yarn.provided.lib.dirs 
property in the YARN configuration.

While working with the YarnClusterDescriptor code, I came up with two questions 
that I hope to get answered:
1. YarnClusterDescriptor seems to force the classpath loading in alphabetical 
order. See code 
here<https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnClusterDescriptor.java#L966>.
 Is there any specific reason for doing that? If I'd like to enforce my own 
order is it possible now?
2. Looks like the flink-dist.jar is treated separately from the other classpath 
classes. In the YarnApplicationFileUploader class, the 
registerMultipleLocalResources method will skip the jar if it is a dist jar. 
See the code 
here<https://github.com/apache/flink/blob/master/flink-yarn/src/main/java/org/apache/flink/yarn/YarnApplicationFileUploader.java#L283>.
 With the current behavior it seems it will always place the flink-dist.jar at 
the end of the classpath. Is there any reason that Flink wants to treat the 
flink-dist.jar separately from other jars?

In our classpath loading we are hoping to enforce a certain order, because 
different jars may contain the same dependency library in different versions. 
We hope to control the order so that the library version we want is the one 
that gets loaded.


Thanks
Leon
