Yes, we've seen this issue as well, though it usually takes many more
resubmits before the error pops up. Interestingly, of the 7 jobs we run (all
of which use different Avro schemas), we only see this issue on 1 of them.
Once the NoClassDefFoundError crops up though, it is necessary to recreate
the task managers.

There's a whole page in the Flink documentation on debugging classloading,
and Avro is mentioned several times on that page:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/debugging_classloading.html

It seems that (in 1.3 at least) each submitted job has its own classloader,
and therefore its own copy of the Avro class definitions. However, the Avro
class cache keeps references to the Avro classes loaded by the classloaders
of previously cancelled jobs. That said, we haven't been able to find a
solution to this error yet. Flink 1.4 would be worth a try because of the
changes to the default classloading behaviour (child-first is the new
default, not parent-first).
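
To illustrate what I mean by a stale class cache, here is a minimal sketch
in plain Java. It is not Flink's or Avro's actual code: JobClassLoader,
CLASS_CACHE, submitJob and demo.UserRecord are all made-up names, and the
"record" class is an empty class generated on the fly so that the example
needs nothing on the classpath. It only shows that two classloaders
defining the same class name produce two unrelated Class objects, and that
a JVM-wide cache keeps handing back the one from the first job.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

class StaleClassCacheDemo {

    // Hypothetical stand-in for a JVM-wide cache that outlives individual jobs.
    static final Map<String, Class<?>> CLASS_CACHE = new HashMap<>();

    // Hypothetical stand-in for a per-job user-code classloader. It defines the
    // class itself, so every loader instance ends up with its own Class object.
    static final class JobClassLoader extends ClassLoader {
        Class<?> define(String name, byte[] bytes) {
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    // Builds the class file of an empty public class that extends Object.
    static byte[] emptyClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                            // magic
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (Java 8)
            out.writeShort(5);                                   // constant pool: entries 1..4
            out.writeByte(7); out.writeShort(2);                 // #1 Class -> #2
            out.writeByte(1); out.writeUTF(internalName);        // #2 Utf8: this class
            out.writeByte(7); out.writeShort(4);                 // #3 Class -> #4
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #4 Utf8: superclass
            out.writeShort(0x0021);                              // ACC_PUBLIC | ACC_SUPER
            out.writeShort(1);                                   // this_class
            out.writeShort(3);                                   // super_class
            out.writeShort(0);                                   // interfaces
            out.writeShort(0);                                   // fields
            out.writeShort(0);                                   // methods
            out.writeShort(0);                                   // attributes
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Simulates one job submission: a fresh classloader defines the class, and
    // the shared cache keeps whichever Class object it saw first.
    static Class<?> submitJob(String className) {
        byte[] bytes = emptyClassBytes(className.replace('.', '/'));
        Class<?> fresh = new JobClassLoader().define(className, bytes);
        CLASS_CACHE.putIfAbsent(className, fresh);
        return fresh;
    }

    public static void main(String[] args) {
        String name = "demo.UserRecord";
        Class<?> firstJob = submitJob(name);
        Class<?> secondJob = submitJob(name);
        Class<?> cached = CLASS_CACHE.get(name);

        System.out.println("same name:       " + firstJob.getName().equals(secondJob.getName()));
        System.out.println("same Class:      " + (firstJob == secondJob));
        System.out.println("cache is stale:  " + (cached == firstJob && cached != secondJob));
        System.out.println("assignable:      " + cached.isAssignableFrom(secondJob));
    }
}
```

The cached Class also holds a reference to the first job's classloader, so
that loader cannot be garbage collected while the cache entry exists. Whether
this is exactly what leads to the NoClassDefFoundError in our case is still
a guess on my part.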





