@Elias This is a known issue that will be fixed in 1.4.2, which we will release very quickly precisely because of this bug: https://issues.apache.org/jira/browse/FLINK-8741
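
Until 1.4.2 is out, the workaround Elias describes in the quoted message below can be sketched as a configuration fragment. This is only an illustration of what he reports doing, not an official recommendation. It assumes the job jar has already been copied into the JobManager's lib folder, and that the option is set in conf/flink-conf.yaml (the file location is an assumption; the quoted mail only names the option and its value).

```yaml
# conf/flink-conf.yaml on the JobManager (file location assumed, see above).
# Flink 1.4 resolves user classes child-first by default; this switches
# back to parent-first, as in the workaround reported in the quoted mail.
classloader.resolve-order: parent-first
```

With parent-first resolution the classes are found through the lib folder rather than the user code class loader, which is why the jar has to be placed there for this to help.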

> On 23. Feb 2018, at 05:53, Elias Levy <fearsome.lucid...@gmail.com> wrote:
>
> Something seems to be off with the user code class loader. The only way I
> can get my job to start is if I drop the job into the lib folder in the JM
> and configure the JM's classloader.resolve-order to parent-first.
>
> Suggestions?
>
> On Thu, Feb 22, 2018 at 12:52 PM, Elias Levy <fearsome.lucid...@gmail.com> wrote:
>
> I am currently suffering through similar issues.
>
> Had a job running happily, but when the cluster tried to restart it, it
> would not find the JSON serializer in it. The job kept trying to restart in
> a loop.
>
> Just today I was running a job I built locally. The job ran fine. I added
> two commits and rebuilt the jar. Now the job dies when it tries to start,
> claiming it can't find the time assigner class. I've unzipped the job jar,
> both locally and in the TM blob directory, and have confirmed the class is
> in it.
>
> This is the backtrace:
>
> java.lang.ClassNotFoundException: com.foo.flink.common.util.TimeAssigner
>     at java.net.URLClassLoader.findClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
>     at java.lang.ClassLoader.loadClass(Unknown Source)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Unknown Source)
>     at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:73)
>     at java.io.ObjectInputStream.readNonProxyDesc(Unknown Source)
>     at java.io.ObjectInputStream.readClassDesc(Unknown Source)
>     at java.io.ObjectInputStream.readOrdinaryObject(Unknown Source)
>     at java.io.ObjectInputStream.readObject0(Unknown Source)
>     at java.io.ObjectInputStream.readObject(Unknown Source)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:393)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:380)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:368)
>     at org.apache.flink.util.SerializedValue.deserializeValue(SerializedValue.java:58)
>     at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.createPartitionStateHolders(AbstractFetcher.java:542)
>     at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.<init>(AbstractFetcher.java:167)
>     at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.<init>(Kafka09Fetcher.java:89)
>     at org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher.<init>(Kafka010Fetcher.java:62)
>     at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010.createFetcher(FlinkKafkaConsumer010.java:203)
>     at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:564)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:86)
>     at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
>     at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:94)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:264)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>     at java.lang.Thread.run(Unknown Source)
>
> On Tue, Jan 23, 2018 at 7:51 AM, Stephan Ewen <se...@apache.org> wrote:
>
> Hi!
>
> We changed a few things between 1.3 and 1.4 concerning Avro. One of the main
> things is that Avro is no longer part of the core Flink class library, but
> needs to be packaged into your application jar file.
>
> The class loading / caching issues of 1.3 with respect to Avro should
> disappear in Flink 1.4, because Avro classes and caches are scoped to the
> job classloaders, so the caches do not go across different jobs, or even
> different operators.
>
> Please check: Make sure you have Avro as a dependency in your jar file (in
> scope "compile").
>
> Hope that solves the issue.
>
> Stephan
>
> On Mon, Jan 22, 2018 at 2:31 PM, Edward <egb...@hotmail.com> wrote:
>
> Yes, we've seen this issue as well, though it usually takes many more
> resubmits before the error pops up. Interestingly, of the 7 jobs we run (all
> of which use different Avro schemas), we only see this issue on 1 of them.
> Once the NoClassDefFoundError crops up though, it is necessary to recreate
> the task managers.
>
> There's a whole page in the Flink documentation on debugging classloading,
> and Avro is mentioned several times on that page:
> https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/debugging_classloading.html
>
> It seems that (in 1.3 at least) each submitted job has its own classloader,
> and its own instance of the Avro class definitions. However, the Avro class
> cache will keep references to the Avro classes from classloaders for the
> previous cancelled jobs. That said, we haven't been able to find a solution
> to this error yet. Flink 1.4 would be worth a try because of the changes
> to the default classloading behaviour (child-first is the new default, not
> parent-first).
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/