I faced this problem yesterday, and putting flink-hadoop-compatibility under the flink/lib folder solved it for me. But what is the official recommendation? Should the jar go into the lib or the opt folder? Is there any difference from a class-loading point of view?
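For what it's worth, a quick way to check from which jar and classloader the class is actually picked up (just a throw-away sketch; the WritableTypeInfo class name below is my assumption of what TypeExtractor looks for):

    // Throw-away diagnostic: print which classloader and which jar a class is loaded from.
    // Run it with the same classpath as the Flink process you are debugging.
    public class ClassLoaderCheck {
        public static void main(String[] args) throws Exception {
            // Assumed fully-qualified name of the type info class moved to flink-hadoop-compatibility.
            String name = "org.apache.flink.api.java.typeutils.WritableTypeInfo";
            Class<?> c = Class.forName(name, false, ClassLoaderCheck.class.getClassLoader());
            System.out.println("classloader: " + c.getClassLoader());
            // getCodeSource() may be null for classes loaded from the bootstrap classpath.
            System.out.println("location:    " + c.getProtectionDomain().getCodeSource());
        }
    }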
Best,
Flavio

On Fri, Apr 7, 2017 at 10:54 PM, Petr Novotnik <petr.novot...@firma.seznam.cz> wrote:

> Hey Fabi,
>
> many thanks for your clarifications! It seems flink-shaded-hadoop2
> itself is already included in the binary distribution:
>
> > $ jar tf flink-1.2.0/lib/flink-dist_2.10-1.2.0.jar | grep org/apache/hadoop | head -n3
> > org/apache/hadoop/
> > org/apache/hadoop/fs/
> > org/apache/hadoop/fs/FileSystem$Statistics$StatisticsAggregator.class
>
> That's why adding just the hadoop-compatibility jar fixed the problem
> for me. I'm not at all into how Flink handles class loading yet, but at
> a first look into `TypeExtractor` I was surprised to see it _not_
> using the thread's current context class loader [1] (with a fallback to
> its own classloader). This led me to investigating the jars' contents
> and finding the problem. I'll set up a JIRA ticket for this issue on
> Monday.
>
> Have a nice weekend,
> P.
>
> [1] http://stackoverflow.com/questions/1771679/difference-between-threads-context-class-loader-and-normal-classloader
>
> On 04/07/2017 09:24 PM, Fabian Hueske wrote:
> > Hi Petr,
> >
> > I think that's expected behavior because the exception is intercepted
> > and enriched with an instruction for solving the problem.
> > As you assumed, you need to add the flink-hadoop-compatibility JAR file
> > to the ./lib folder. Unfortunately, the file is not included in the
> > binary distribution.
> > You can either build it from source or manually download it from a
> > public Maven repository. You might need to add the flink-shaded-hadoop2
> > jar file as well, which is a dependency of flink-hadoop-compatibility.
> >
> > I think we should make that easier for users and add a pre-built jar
> > file to the ./opt folder of the binary distribution.
> > Would you mind opening a JIRA for this?
> >
> > Now a bit of background on why we moved the TypeInfo to
> > flink-hadoop-compatibility. We are preparing Flink's core to become
> > independent of Hadoop, i.e., Flink's core should not require Hadoop. We
> > will of course keep the option to run Flink on YARN and write data to
> > HDFS, but this should be optional and not baked into the core.
> >
> > Best,
> > Fabian
> >
> > 2017-04-07 16:27 GMT+02:00 Petr Novotnik <petr.novot...@firma.seznam.cz>:
> >
> >     Hello,
> >
> >     with 1.2.0 `WritableTypeInfo` got moved into its own artifact
> >     (flink-hadoop-compatibility_2.10-1.2.0.jar). Unlike with 1.1.0, the
> >     distribution jar `flink-dist_2.10-1.2.0.jar` does not include the
> >     hadoop compatibility classes anymore. However, `TypeExtractor`, which
> >     is part of the distribution jar, tries to load `WritableTypeInfo`
> >     using the class loader it was itself loaded from:
> >
> >     > Class<?> typeInfoClass;
> >     > try {
> >     >     typeInfoClass = Class.forName(HADOOP_WRITABLE_TYPEINFO_CLASS, false, TypeExtractor.class.getClassLoader());
> >     > }
> >     > catch (ClassNotFoundException e) {
> >     >     throw new RuntimeException("Could not load the TypeInformation for the class '"
> >     >         + HADOOP_WRITABLE_CLASS + "'. You may be missing the 'flink-hadoop-compatibility' dependency.");
> >     > }
> >
> >     Adding `flink-hadoop-compatibility` to my application jar leads to the
> >     following stack trace on YARN (running `bin/flink run -m yarn-cluster...`):
> >
> >     > Caused by: java.lang.RuntimeException: Could not load the TypeInformation for the class 'org.apache.hadoop.io.Writable'. You may be missing the 'flink-hadoop-compatibility' dependency.
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createHadoopWritableTypeInfo(TypeExtractor.java:2025)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1649)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1591)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:778)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createSubTypesInfo(TypeExtractor.java:998)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:679)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:629)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:595)
> >     >     at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:588)
> >     >     at org.apache.flink.api.common.typeinfo.TypeHint.<init>(TypeHint.java:47)
> >     >     at cz.seznam.euphoria.benchmarks.flink.Util$2.<init>(Util.java:80)
> >
> >     I guess I'm supposed to customize my Flink installation by adding the
> >     hadoop-compatibility jar to Flink's `lib` dir, correct? If so, is this
> >     documented? I couldn't find any hints in [1] or [2] and thus suppose
> >     this might be an unintentional change between 1.1 and 1.2.
> >
> >     [1] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/hadoop_compatibility.html
> >     [2] https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/migration.html
> >
> >     P.
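For reference, the lookup Petr suggests (thread context class loader first, falling back to the class's own loader) might look roughly like the sketch below. This is only an illustration, not the actual Flink code, and the value of the constant is assumed since it is not shown in the quoted snippet:

    import org.apache.flink.api.java.typeutils.TypeExtractor;

    // Illustration only: try the thread's context class loader first and fall
    // back to TypeExtractor's own class loader, mirroring the quoted snippet.
    public class WritableTypeInfoLookup {

        // Assumed value of the HADOOP_WRITABLE_TYPEINFO_CLASS constant.
        private static final String HADOOP_WRITABLE_TYPEINFO_CLASS =
                "org.apache.flink.api.java.typeutils.WritableTypeInfo";

        static Class<?> loadWritableTypeInfo() {
            ClassLoader contextCl = Thread.currentThread().getContextClassLoader();
            if (contextCl != null) {
                try {
                    return Class.forName(HADOOP_WRITABLE_TYPEINFO_CLASS, false, contextCl);
                } catch (ClassNotFoundException ignored) {
                    // not visible to the context class loader; fall through
                }
            }
            try {
                // same lookup as the TypeExtractor code quoted above
                return Class.forName(HADOOP_WRITABLE_TYPEINFO_CLASS, false, TypeExtractor.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new RuntimeException("Could not load the TypeInformation for the class '"
                        + HADOOP_WRITABLE_TYPEINFO_CLASS
                        + "'. You may be missing the 'flink-hadoop-compatibility' dependency.", e);
            }
        }
    }

With a lookup like this, a compatibility jar bundled into the application jar could in principle be resolved through the context class loader even when it is not present in lib/, which is what [1] in Petr's mail is getting at.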