[ https://issues.apache.org/jira/browse/FLINK-24897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17445631#comment-17445631 ]
Biao Geng commented on FLINK-24897:
-----------------------------------

Hi [~trohrmann] and [~wangyang0918], thank you very much for your replies. I agree with Till's suggestion about reusing the existing logic to include {{usrlib}} in the user classloader. Yang's questions are also helpful and critical.

*A summary of my answers about {{usrlib}}:*
0. We should ship {{usrlib}} by default, as we already do for the {{lib}} dir.
1. If users specify {{usrlib}} again in the {{yarn.ship-files}} option, we should avoid uploading it a second time and should not add its classes to the system classpath.
2. It should work for per-job mode.
3. {{usrlib}} will take effect in per-job mode only when UserJarInclusion is DISABLED. But we should reconsider the default value of the {{UserJarInclusion}} option.

*Detail:*

Q1: Currently, I think we should ship {{usrlib}} by default if it exists because, AFAIK, {{usrlib}} is the default userClassPath defined by Flink. Asking users to specify it explicitly would somewhat defeat Flink's contract with users.
When users specify a shipped directory named "usrlib", I think there are 3 options:
Option1: skip it
Option2: report an error
Option3: do nothing special, just upload it and add the files in {{usrlib}} to the system classpath
Option1 seems to be the easiest, just like what we have done for {{flink_dist.jar}} when users specify {{lib}} in ship files.
Option3 is worth mentioning: if users specify {{usrlib}} in ship files, the files in {{usrlib}} will be added to the system classpath, but if users use child-first resolve order, the files in {{usrlib}} will also be loaded by UserClassLoader, as they are in userClassPath as well. Bad things happen: if users choose parent-first resolve order, the files in {{usrlib}} will be loaded by AppClassLoader, which breaks the design.
So, in summary, I think skipping it is the better option.
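To make Option1 concrete, here is a minimal sketch in plain Java of what "skip it" could look like. It is only an illustration under my own assumptions, not the actual {{YarnClusterDescriptor}} code: the class name {{ShipFilesFilter}} and the method {{skipUsrlib}} are hypothetical, and it matches purely on the last path element being named "usrlib".

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Flink: drops any user-specified ship
// entry whose last path element is "usrlib", so that it is neither uploaded
// twice nor added to the system classpath.
class ShipFilesFilter {

    // Assumed directory name, taken from the discussion above.
    static final String USRLIB_DIR_NAME = "usrlib";

    static List<Path> skipUsrlib(List<Path> shipFiles) {
        List<Path> result = new ArrayList<>();
        for (Path path : shipFiles) {
            Path name = path.getFileName(); // null for a root path such as "/"
            if (name != null && USRLIB_DIR_NAME.equals(name.toString())) {
                continue; // Option1: silently skip the usrlib directory
            }
            result.add(path);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Path> shipFiles = Arrays.asList(
                Paths.get("/opt/flink/lib"),
                Paths.get("/opt/flink/usrlib"),
                Paths.get("/data/udf.jar"));
        // Only the entries not named "usrlib" remain.
        System.out.println(skipUsrlib(shipFiles));
    }
}
```

A real change would also need to decide whether to log a warning when an entry is skipped, and whether the check should compare against the resolved default {{usrlib}} location rather than just the directory name.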
Q2: After checking the code of {{FileJobGraphRetriever}} and {{YarnJobClusterEntrypoint}}, I think we are already prepared to use {{usrlib}} once it is uploaded to the cluster.

Q3: I agree that {{usrlib}} will take effect in per-job mode only when UserJarInclusion is DISABLED. But currently the default value of UserJarInclusion is {{ORDERED}}, and it applies to all 3 modes (per-job, session, application). If we agree that {{usrlib}} should be shipped automatically, we may need to reconsider the default value of this option if we want UserClassLoader to load the jars in {{usrlib}}.

> Enable application mode on YARN to use usrlib
> ---------------------------------------------
>
>                 Key: FLINK-24897
>                 URL: https://issues.apache.org/jira/browse/FLINK-24897
>             Project: Flink
>          Issue Type: Improvement
>          Components: Deployment / YARN
>            Reporter: Biao Geng
>            Priority: Major
>
> Hi there,
> I am working to utilize application mode to submit flink jobs to YARN cluster
> but I find that currently there is no easy way to ship my user-defined
> jars(e.g. some custom connectors or udf jars that would be shared by some
> jobs) and ask the FlinkUserCodeClassLoader to load classes in these jars.
> I checked some relevant jiras, like FLINK-21289. In k8s mode, there is a
> solution that users can use `usrlib` directory to store their user-defined
> jars and these jars would be loaded by FlinkUserCodeClassLoader when the job
> is executed on JM/TM.
> But on YARN mode, `usrlib` does not work as that:
> In this method(org.apache.flink.yarn.YarnClusterDescriptor#addShipFiles), if
> I want to use `yarn.ship-files` to ship `usrlib` from my flink client(in my
> local machine) to remote cluster, I must not set UserJarInclusion to
> DISABLED due to the checkArgument(). However, if I do not set that option to
> DISABLED, the user jars to be shipped will be added into systemClassPaths. As
> a result, classes in those user jars will be loaded by AppClassLoader.
> But if I do not ship these jars, there is no convenient way to utilize these
> jars in my flink run command. Currently, all I can do seems to use `-C`
> option, which means I have to upload my jars to some shared store first and
> then use these remote paths. It is not so perfect as we have already make it
> possible to ship jars or files directly and we also introduce `usrlib` in
> application mode on YARN. It would be more user-friendly if we can allow
> shipping `usrlib` from local to remote cluster while using
> FlinkUserCodeClassLoader to load classes in the jars in `usrlib`.
>

--
This message was sent by Atlassian Jira
(v8.20.1#820001)