Many thanks Aljoscha! I am sorry I missed this section.

Regards,
Kedar

On Mon, Mar 12, 2018 at 9:16 AM, Aljoscha Krettek <aljos...@apache.org> wrote:

> Hi Kedar,
>
> There is this section in the Flink docs:
> https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_classloading.html
>
> Best,
> Aljoscha
>
>
> On 10. Mar 2018, at 05:53, kedar mhaswade <kedar.mhasw...@gmail.com>
> wrote:
>
> This is an interesting question and it usually has consequences that are
> far-reaching in user experience.
>
> If a Flink app is supposed to be a "standalone app" that any Flink
> installation should be able to run, then child-first classloading makes
> sense. This is how we build many of the Java application servers (e.g.
> GlassFish, JBoss, etc.). Doing this makes the application "self-contained"
> and perhaps portable. Of course, this increases the size of the Jar. The
> one issue to watch out for is an application using framework classes that
> are newer than the framework itself. For instance, should I expect my app
> with Flink *1.6* DataSet/DataStream classes to run smoothly on a Flink 1.5
> installation?
>
> If a Flink app depends on a particular (version of the) Flink
> installation, then, if using parent-first classloading, the app can make
> use of the classes that the installation itself uses. This makes the app
> (comparatively) less self-contained, but it limits the size of the app's
> Jar. There are advantages to doing this, but it poses problems, especially
> in upgrades.
>
> Whether one or the other should be the behavior largely depends on how the
> applications are built, tested, and deployed. The application's build comes
> into the picture because in tools like Maven a dependency can be declared
> "provided", which means that if you know that your app's dependency is also
> your framework's (i.e. Flink's) dependency and you, as an app developer, are
> okay with that, Maven won't bundle it in your app's Jar.
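To make the two resolution orders discussed above concrete, here is a minimal sketch of a child-first class loader in Java. It is an illustration only and not Flink's actual implementation: the class name ChildFirstSketchLoader and the helper tryLoad are made up for this example, and the rule "always delegate java.* to the parent" is an assumption chosen to keep the sketch small.

```java
import java.net.URL;
import java.net.URLClassLoader;

/**
 * Illustrative child-first loader (hypothetical name, not Flink's code).
 * It looks in its own URLs first and only then asks the parent,
 * which is the reverse of the JVM's default parent-first delegation.
 */
class ChildFirstSketchLoader extends URLClassLoader {

    ChildFirstSketchLoader(URL[] urls, ClassLoader parent) {
        super(urls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                if (name.startsWith("java.")) {
                    // Assumption for this sketch: core JDK classes always come from the parent.
                    c = super.loadClass(name, false);
                } else {
                    try {
                        // Child first: search this loader's own jars.
                        c = findClass(name);
                    } catch (ClassNotFoundException notInChild) {
                        // Fall back to the normal parent-first lookup.
                        c = super.loadClass(name, false);
                    }
                }
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Convenience helper for experiments: returns null instead of throwing. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

If the URLs given to such a loader contain a library that the parent also has, the child's copy wins. A class loaded from the child's copy is a different runtime class from the one the parent loaded under the same name, which is where the "X cannot be cast to X" errors mentioned in this thread come from.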
>
> So, my recommendation is that since this appears to be a backward
> incompatible change, Flink should provide an option to go back to
> parent-first classloading for a given app, at least for 1.5. Child-first
> classloading seems like the right thing to do given how (unnecessarily)
> complicated the deployments have become and given how frequently apps use
> library versions that are different from the framework's.
>
> The Elasticsearch solution has merits too, but it is unclear if it helps *at
> deployment time* merely to identify that there is a duplicate (without
> knowing where it has come from). Ideally, when people build the so-called
> shadow Jar (one Jar with all dependencies) the build script should warn of
> the duplicates. Shadow Jars alleviate (but do not remove) the problems of
> "Jar Hell". But it seems to me that till we move to a modular Java (that is
> Java 9; I think this is way out in the future), this is the preferred
> solution.
>
> That said, I'd really like to see a classloading section in the Flink docs
> (somewhere in dev/best_practices.html). Is a JIRA in order?
>
> Regards,
> Kedar
>
> On Fri, Mar 9, 2018 at 1:52 PM, Stephan Ewen <ewenstep...@gmail.com>
> wrote:
>
>> @Ken very interesting thought.
>>
>> One could have three options:
>> - forbid duplicate classes
>> - parent first conflict resolution
>> - child first conflict resolution
>>
>> Having number one as the default and letting the error message suggest
>> options two and three would definitely make users aware of the
>> issue...
>>
>> On Fri, Mar 9, 2018, 21:09 Ken Krugler <kkrugler_li...@transpac.com>
>> wrote:
>>
>>> I can’t believe I’m suggesting this, but perhaps the Elasticsearch
>>> “Hammer of Thor” (aka “jar hell”) approach would be appropriate here.
>>>
>>> Basically they prevent a program from running if there are duplicate
>>> classes on the classpath.
>>>
>>> This causes headaches when you really need a different version of
>>> library X, and that’s already on the class path.
>>>
>>> See https://github.com/elastic/elasticsearch/issues/14348 for an
>>> example of the issues it can cause.
>>>
>>> But it definitely catches a lot of oops-ish mistakes in building the
>>> jars, and makes debugging easier (they print out “class X jar1: <path to
>>> jar> jar2: <path to jar>”).
>>>
>>> Caused by: java.lang.IllegalStateException: jar hell!
>>> class: jdk.packager.services.UserJvmOptionsService
>>> jar1: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/ant-javafx.jar
>>> jar2: /Library/Java/JavaVirtualMachines/jdk1.8.0_66.jdk/Contents/Home/lib/packager.jar
>>>
>>> -- Ken
>>>
>>>
>>> On Mar 9, 2018, at 3:21 AM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>> Hi all!
>>>
>>> Flink 1.4 introduces child-first classloading by default, for the
>>> application libraries.
>>>
>>> We added that because it allows applications to use different versions
>>> of many libraries, compared to what Flink uses in its core, or compared to
>>> what other dependencies (like Hadoop) pull into the class path.
>>>
>>> For example, applications can use versions of Akka, Avro, Protobuf,
>>> etc. that differ from what Flink / Hadoop / etc. use.
>>>
>>> Now, while that is nice, child-first classloading runs into trouble when
>>> the application jars are not properly built, meaning when the application
>>> JAR contains libraries that it should not (because they are already in the
>>> classpath / lib folder).
>>>
>>> For example, when the class path has the Kafka Connector (connector is
>>> in the lib directory) and the application jar also contains Kafka, then we
>>> get nasty errors due to class duplication and impossible class casts (X
>>> cannot be cast to X).
>>>
>>>
>>> What I would like to understand is how this change worked out for the
>>> users. Based on that, we can keep this or revert this change in the next
>>> release.
>>>
>>> Please reply to this mail with:
>>>
>>> a. This was a great change, keep it and polish it.
>>>
>>> b. This ended up causing more problems than it solved, so please set
>>> the default back to "parent-first" in 1.5 and leave "child-first" as an
>>> optional flag.
>>>
>>>
>>> Thanks a lot,
>>> Stephan
>>>
>>>
>>> --------------------------------------------
>>> http://about.me/kkrugler
>>> +1 530-210-6378
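The duplicate-class check Ken describes in this thread (refuse to start if the same class appears in more than one jar, and say which jars) can be sketched in a few lines of Java. This is not Elasticsearch's actual code: the class DuplicateClassCheck and its method names are hypothetical, and the sketch deliberately ignores details such as module-info.class and multi-release jars.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper that reports classes found in more than one jar. */
class DuplicateClassCheck {

    /** Lists the names of all .class entries in one jar file. */
    static List<String> classEntries(Path jar) {
        List<String> names = new ArrayList<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    names.add(name);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    /**
     * Takes a map of jar name -> class entries and returns a map of
     * class entry -> jars containing it, keeping only the duplicates.
     */
    static Map<String, List<String>> findDuplicates(Map<String, List<String>> classesByJar) {
        Map<String, List<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : classesByJar.entrySet()) {
            for (String cls : jar.getValue()) {
                owners.computeIfAbsent(cls, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Drop every class that lives in exactly one jar.
        owners.values().removeIf(jars -> jars.size() < 2);
        return owners;
    }

    /** Fails fast, naming each duplicated class and the jars it came from. */
    static void failOnDuplicates(Map<String, List<String>> classesByJar) {
        Map<String, List<String>> duplicates = findDuplicates(classesByJar);
        if (!duplicates.isEmpty()) {
            throw new IllegalStateException("duplicate classes: " + duplicates);
        }
    }
}
```

A build script or startup hook could call classEntries for every jar on the class path, collect the results into a map, and pass that to failOnDuplicates. Because the report names the jars, it also covers the "where has it come from" part of the concern raised earlier in the thread.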