Yes, we've seen this issue as well, though it usually takes many more resubmits before the error pops up. Interestingly, of the 7 jobs we run (all of which use different Avro schemas), we only see this issue on 1 of them. Once the NoClassDefFoundError crops up though, it is necessary to recreate the task managers.
There's a whole page in the Flink documentation on debugging classloading, and Avro is mentioned several times on that page: https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/debugging_classloading.html It seems that (in 1.3 at least) each submitted job has its own classloader, and therefore its own instance of the Avro class definitions. However, the Avro class cache keeps references to the Avro classes loaded by the classloaders of previously cancelled jobs. That said, we haven't been able to find a solution to this error yet. Flink 1.4 would be worth a try because of the changes to the default classloading behaviour (child-first is the new default, rather than parent-first).
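To make the suspected mechanism concrete, here is a minimal, self-contained sketch in plain Java. It does not use Flink or Avro: `JobClassLoader`, `UserRecord`, `CLASS_CACHE` and `lookup` are hypothetical stand-ins for a per-job classloader, a generated Avro class, and a name-keyed class cache living in a parent classloader. It only shows how such a cache can keep serving the class object from a cancelled job's classloader; it is not a reproduction of the NoClassDefFoundError itself.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

public class StaleClassCacheDemo {

    /** Stand-in for a generated Avro record class that ships in the job jar. */
    public static class UserRecord {}

    /** Hypothetical per-job classloader: defines the record class itself, delegates everything else. */
    static class JobClassLoader extends ClassLoader {
        private final String ownedClass;

        JobClassLoader(ClassLoader parent, String ownedClass) {
            super(parent);
            this.ownedClass = ownedClass;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(ownedClass)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Define a fresh copy of the class in this loader instead of asking the parent.
                    byte[] bytes = readClassBytes(name);
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                return c;
            }
        }

        private byte[] readClassBytes(String name) throws ClassNotFoundException {
            String path = name.replace('.', '/') + ".class";
            try (InputStream in = getParent().getResourceAsStream(path)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                return in.readAllBytes(); // requires Java 9 or newer
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Hypothetical cache keyed by class name only, shared across all jobs. */
    static final Map<String, Class<?>> CLASS_CACHE = new HashMap<>();

    static Class<?> lookup(String name, ClassLoader loader) throws ClassNotFoundException {
        Class<?> cached = CLASS_CACHE.get(name);
        if (cached == null) {
            cached = Class.forName(name, false, loader);
            CLASS_CACHE.put(name, cached);
        }
        return cached;
    }

    /** Returns true if the second "job" is handed the class defined by the first job's loader. */
    static boolean cacheServesStaleClass() {
        try {
            CLASS_CACHE.clear();
            String name = UserRecord.class.getName();
            ClassLoader parent = StaleClassCacheDemo.class.getClassLoader();
            ClassLoader job1 = new JobClassLoader(parent, name);
            ClassLoader job2 = new JobClassLoader(parent, name);

            Class<?> first = lookup(name, job1);    // first submission fills the cache
            Class<?> second = lookup(name, job2);   // resubmission hits the cache
            Class<?> fresh = job2.loadClass(name);  // what job2 itself would load

            return second == first
                    && second.getClassLoader() == job1
                    && fresh != second;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("cache serves stale class: " + cacheServesStaleClass());
    }
}
```

The point of the sketch is that the cache entry also keeps the first job's classloader reachable, so it cannot be unloaded after the job is cancelled. Whether this is exactly what happens inside Avro in our setup is an assumption on my part, based on the documentation page above.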