Hi,

I was reading that I should avoid dynamic classloading by copying the job's jar into the /lib directory (see the excerpt below).

1. How can I confirm that the jar was copied over? The only copy messages I see in the log are the following:

2017-11-20 15:36:52,724 INFO org.apache.flink.yarn.Utils - Copying from file:/local/data/scratch/tmp/p2epdlsu/chregi/flink-1.2.0/lib to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/lib
2017-11-20 15:37:04,644 INFO org.apache.flink.yarn.Utils - Copying from file:/local/data/scratch/tmp/p2epdlsu/chregi/flink-1.2.0/lib/flink-dist_2.10-1.2.0.jar to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/flink-dist_2.10-1.2.0.jar
2017-11-20 15:37:06,634 INFO org.apache.flink.yarn.Utils - Copying from /home/p2epdlsu/datalake-cdc-prod/etc/flink/conf/flink-conf.yaml to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/flink-conf.yaml

2. I also saw this ticket, https://issues.apache.org/jira/browse/FLINK-4913, and was wondering whether it is orthogonal to dynamic loading and to having to put my jar in the lib directory, or whether it should already handle this by default.

3. I start Flink from a globally mounted, shared copy that I don't have write access to, so I can't easily put jars in that lib folder. For the same reason, I shouldn't modify the global copy of bin/config.sh. Is there a way to configure where Flink picks up the lib folder from?

Thanks!

Avoiding Dynamic Classloading

All components (JobManager, TaskManager, Client, ApplicationMaster, ...) log their classpath setting on startup. They can be found as part of the environment information at the beginning of the log.

When running a setup where the Flink JobManager and TaskManagers are exclusive to one particular job, one can put JAR files directly into the /lib folder to make sure they are part of the classpath and not loaded dynamically.

It usually works to put the job's JAR file into the /lib directory. The JAR will be part of both the classpath (the AppClassLoader) and the dynamic class loader (FlinkUserCodeClassLoader).
Because the AppClassLoader is the parent of the FlinkUserCodeClassLoader (and Java loads parent-first), this should result in classes being loaded only once.

For setups where the job's JAR file cannot be put into the /lib folder (for example because the setup is a session that is used by multiple jobs), it may still be possible to put common libraries into the /lib folder and avoid dynamic class loading for those.

Regina Chan
Goldman Sachs - Enterprise Platforms, Data Architecture
30 Hudson Street, 37th floor | Jersey City, NY 07302
(212) 902-5697
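
For reference, the parent-first behaviour described in the excerpt can be illustrated outside of Flink with a small Java sketch. It is only an illustration under stated assumptions: the class name ParentFirstDemo and the method loadedOnce are hypothetical, and a plain java.net.URLClassLoader stands in for the user-code class loader. It is not Flink's FlinkUserCodeClassLoader. The sketch hands a child loader the same location its parent already serves and checks that the class is still defined only once, by the parent.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.security.CodeSource;

// Hypothetical demo class; a plain URLClassLoader stands in for a user-code loader.
class ParentFirstDemo {

    /** Loads this class through a child loader and reports whether the parent's copy was reused. */
    public static boolean loadedOnce() {
        ClassLoader parent = ParentFirstDemo.class.getClassLoader();

        // Location (directory or JAR) this class came from, if the JVM exposes it.
        // It plays the role of a job JAR that is visible to both loaders.
        CodeSource source = ParentFirstDemo.class.getProtectionDomain().getCodeSource();
        URL[] urls = (source != null && source.getLocation() != null)
                ? new URL[] {source.getLocation()}
                : new URL[0];

        try (URLClassLoader child = new URLClassLoader(urls, parent)) {
            // URLClassLoader delegates to its parent before searching its own URLs.
            Class<?> viaChild = child.loadClass(ParentFirstDemo.class.getName());
            // Same Class object, defined by the parent: the class was loaded only once.
            return viaChild == ParentFirstDemo.class && viaChild.getClassLoader() == parent;
        } catch (Exception e) {
            throw new IllegalStateException("class loading demo failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("class reused from parent loader: " + loadedOnce());
    }
}
```

If the child loader searched its own URLs first instead, the same check would be expected to fail, because the child would define a second copy of the class. That is the duplicate-loading situation the excerpt is trying to avoid.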