Hi,

I read that I should avoid dynamic classloading by copying the job's jar into 
the /lib directory (see the quoted documentation below).


1.     How can I confirm that the jar was copied over? The log only shows the 
following:

2017-11-20 15:36:52,724 INFO  org.apache.flink.yarn.Utils - Copying from file:/local/data/scratch/tmp/p2epdlsu/chregi/flink-1.2.0/lib to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/lib
2017-11-20 15:37:04,644 INFO  org.apache.flink.yarn.Utils - Copying from file:/local/data/scratch/tmp/p2epdlsu/chregi/flink-1.2.0/lib/flink-dist_2.10-1.2.0.jar to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/flink-dist_2.10-1.2.0.jar
2017-11-20 15:37:06,634 INFO  org.apache.flink.yarn.Utils - Copying from /home/p2epdlsu/datalake-cdc-prod/etc/flink/conf/flink-conf.yaml to hdfs://d173636/user/delp_prod/.flink/application_1511197407590_58493/flink-conf.yaml


2.     I also saw this ticket https://issues.apache.org/jira/browse/FLINK-4913 
and was wondering whether it is orthogonal to dynamic loading and to putting 
my jar in the lib directory. Or is this already handled by default?


3.     I start Flink from a globally mounted/shared copy that I don't have 
write access to, so I can't easily put jars in its lib folder. For the same 
reason I shouldn't modify the global copy of bin/config.sh. Is there a way 
to configure where Flink picks up the lib folder from?

Thanks!



Avoiding Dynamic Classloading
All components (JobManager, TaskManager, Client, ApplicationMaster, ...) log 
their classpath setting on startup. They can be found as part of the 
environment information at the beginning of the log.
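
For what it's worth, here is a rough sketch of how one might check whether a 
given jar shows up in such a classpath string. It is plain Java and not a Flink 
API: the class name ClasspathCheck, the helper findJar and the jar name 
my-job.jar are made up for illustration. It assumes the classpath is a list of 
entries joined by the platform's path separator, as in the java.class.path 
system property.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class ClasspathCheck {

    // Returns every classpath entry whose file name equals jarName.
    // "classpath" is a string of entries joined by File.pathSeparator.
    static List<String> findJar(String classpath, String jarName) {
        List<String> matches = new ArrayList<>();
        for (String entry : classpath.split(Pattern.quote(File.pathSeparator))) {
            if (new File(entry).getName().equals(jarName)) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // "my-job.jar" is a hypothetical default; pass the real jar name as an argument.
        String jarName = args.length > 0 ? args[0] : "my-job.jar";
        // Inspects the classpath of the JVM running this check.
        String classpath = System.getProperty("java.class.path");
        System.out.println(findJar(classpath, jarName));
    }
}
```

The same helper could be fed the classpath string copied out of a JobManager or 
TaskManager log instead of the local java.class.path value. An empty result 
would suggest the jar is not on that component's classpath.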
When running a setup where the Flink JobManager and TaskManagers are exclusive 
to one particular job, one can put JAR files directly into the /lib folder to 
make sure they are part of the classpath and not loaded dynamically.
It usually works to put the job's JAR file into the /lib directory. The JAR 
will be part of both the classpath (the AppClassLoader) and the dynamic class 
loader (FlinkUserCodeClassLoader). Because the AppClassLoader is the parent of 
the FlinkUserCodeClassLoader (and Java loads parent-first), this should result 
in classes being loaded only once.
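
The parent-first behaviour described above can be illustrated with a small 
sketch. It uses a plain java.net.URLClassLoader as a stand-in for the 
FlinkUserCodeClassLoader, which is an assumption made for illustration and not 
Flink's actual implementation. The class name ParentFirstDemo is hypothetical. 
The child loader is given the same location the class was originally loaded 
from, so it could define the class itself, yet it asks its parent first.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.security.CodeSource;

public class ParentFirstDemo {

    // Loads this class again through a child loader that could also find it
    // on its own, and returns the Class object the child hands back.
    static Class<?> loadViaChild() {
        ClassLoader parent = ParentFirstDemo.class.getClassLoader();
        CodeSource source = ParentFirstDemo.class.getProtectionDomain().getCodeSource();
        // Fall back to an empty URL list if the code location is unknown.
        URL[] urls = (source == null || source.getLocation() == null)
                ? new URL[0]
                : new URL[] { source.getLocation() };
        try (URLClassLoader child = new URLClassLoader(urls, parent)) {
            // URLClassLoader delegates to its parent before searching its own URLs.
            return child.loadClass(ParentFirstDemo.class.getName());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> viaChild = loadViaChild();
        System.out.println("same class object: " + (viaChild == ParentFirstDemo.class));
        System.out.println("defined by parent: "
                + (viaChild.getClassLoader() == ParentFirstDemo.class.getClassLoader()));
    }
}
```

If the child returns the very same Class object, the class exists only once in 
the JVM, which is the effect the paragraph above relies on.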
For setups where the job's JAR file cannot be put into the /lib folder (for 
example because the setup is a session that is used by multiple jobs), it may 
still be possible to put common libraries into the /lib folder, and avoid 
dynamic class loading for those.


Regina Chan
Goldman Sachs - Enterprise Platforms, Data Architecture
30 Hudson Street, 37th floor | Jersey City, NY 07302 *  (212) 902-5697
