For what it's worth, we believe we are able to work around this issue by adding the following line to our flink-conf.yaml:

    classloader.parent-first-patterns.additional: javax.xml.;org.apache.xerces.
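To sanity-check that the extra pattern takes effect, we'd suggest a probe
along these lines, run from inside a job (a rough sketch; JaxbProbe is an
illustrative name, not anything from Flink):

    import org.apache.flink.api.common.functions.RichMapFunction;

    // Hypothetical probe: reports which classloader ends up serving the
    // JAXB converter class from user code. Passing initialize=false to
    // Class.forName avoids tripping the failing static initializer.
    public class JaxbProbe extends RichMapFunction<String, String> {
        @Override
        public String map(String value) throws Exception {
            Class<?> impl = Class.forName(
                    "javax.xml.bind.DatatypeConverterImpl",
                    false,
                    Thread.currentThread().getContextClassLoader());
            // With javax.xml. in the parent-first patterns, this should
            // report the bootstrap loader (printed as "null") on Java 8,
            // rather than a child-first user-code classloader.
            return value + " loadedBy=" + impl.getClassLoader();
        }
    }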
On Thu, Dec 6, 2018 at 2:28 AM Chesnay Schepler <ches...@apache.org> wrote:

> Small correction: Flink 1.7 does not support jdk9; we only fixed some of
> the issues, not all of them.
>
> On 06.12.2018 07:13, Mike Mintz wrote:
>
> Hi Flink developers,
>
> We're running some new DataStream jobs on Flink 1.7.0 using the shaded
> Hadoop S3 file system, and running into frequent errors saving
> checkpoints and savepoints to S3. I'm not sure what the underlying
> reason for the error is, but we often fail with the following stack
> trace, which appears to be due to the missing
> javax.xml.bind.DatatypeConverterImpl class in an error-handling path
> for AmazonS3Client.
>
> java.lang.NoClassDefFoundError: Could not initialize class javax.xml.bind.DatatypeConverterImpl
>     at javax.xml.bind.DatatypeConverter.initConverter(DatatypeConverter.java:140)
>     at javax.xml.bind.DatatypeConverter.printBase64Binary(DatatypeConverter.java:611)
>     at org.apache.flink.fs.s3base.shaded.com.amazonaws.util.Base64.encodeAsString(Base64.java:62)
>     at org.apache.flink.fs.s3base.shaded.com.amazonaws.util.Md5Utils.md5AsBase64(Md5Utils.java:104)
>     at org.apache.flink.fs.s3base.shaded.com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1647)
>     at org.apache.flink.fs.shaded.hadoop3.org.apache.hadoop.fs.s3a.S3AFileSystem.putObjectDirect(S3AFileSystem.java:1531)
>
> I uploaded the full stack trace at
> https://gist.github.com/mikemintz/4769fc7bc3320c84ac97061e951041a0
>
> For reference, we're running Flink from the "Apache 1.7.0 Flink only
> Scala 2.11" binary tgz, we've copied flink-s3-fs-hadoop-1.7.0.jar from
> opt/ to lib/, we're not defining HADOOP_CLASSPATH, and we're running
> Java 8 (openjdk version "1.8.0_191") on Ubuntu 18.04 x86_64.
>
> Presumably there are two issues: 1) some periodic error with S3, and
> 2) some classpath / class-loading issue with
> javax.xml.bind.DatatypeConverterImpl that's preventing the original
> error from being displayed. I'm more curious about the latter issue.
>
> This is super puzzling since javax/xml/bind/DatatypeConverterImpl.class
> is included in our rt.jar, and lsof confirms we're reading that rt.jar,
> so I suspect it's something tricky with custom classloaders or the way
> the shaded S3 jar works. Note that this class is not included in
> flink-s3-fs-hadoop-1.7.0.jar (which we are using), but it is included
> in flink-shaded-hadoop2-uber-1.7.0.jar (which we are not using).
>
> Another thing that jumped out at us was that Flink 1.7 can now be built
> with JDK 9, but Java 9 deprecates the javax.xml.bind packages, which
> now require explicitly adding the java.xml.bind module [0]. And we saw
> that direct references to javax.xml.bind were removed from flink-core
> for 1.7 [1].
>
> Some things we tried, without success:
>
> - Building Flink from source on a computer with Java 8 installed. We
>   still got NoClassDefFoundError.
> - Using the binary version of Flink on machines with Java 9 installed.
>   We get a NullPointerException in ClosureCleaner.
> - Downloading the jaxb-api jar [2], which has
>   javax/xml/bind/DatatypeConverterImpl.class, and setting
>   HADOOP_CLASSPATH to include that jar. We still got
>   NoClassDefFoundError.
> - Using iptables to completely block S3 traffic, hoping this would
>   make it easier to reproduce. The connection errors are properly
>   displayed, so these connection errors must go down another
>   error-handling path.
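On the Java 9 point above: the failing call chain can be exercised on a
bare JVM, independent of Flink. A minimal sketch (the class name is ours;
per the stack trace, Md5Utils.md5AsBase64 in the AWS SDK does the
equivalent):

    import javax.xml.bind.DatatypeConverter;

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;

    public class JaxbRepro {
        public static void main(String[] args) throws Exception {
            // Mirrors the path in the stack trace: MD5 the payload, then
            // base64-encode it via JAXB's DatatypeConverter, whose static
            // initialization is what fails with NoClassDefFoundError.
            byte[] md5 = MessageDigest.getInstance("MD5")
                    .digest("payload".getBytes(StandardCharsets.UTF_8));
            System.out.println(DatatypeConverter.printBase64Binary(md5));
        }
    }

On Java 8 this runs as-is; on Java 9 it should only compile and run with
--add-modules java.xml.bind, since the java.xml.bind module is deprecated
there and no longer resolved by default.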
> Would love to hear any ideas about what might be happening, or further
> ideas we can try.
>
> Thanks!
> Mike
>
> [0] http://cr.openjdk.java.net/~iris/se/9/java-se-9-fr-spec/#APIs-proposed-for-removal
> [1] https://github.com/apache/flink/pull/6801
> [2] https://mvnrepository.com/artifact/javax.xml.bind/jaxb-api/2.3.1
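One more diagnostic that may help anyone digging into the class-loading
side (a sketch; it asks the JVM where the class bytes would come from,
complementing the lsof check above):

    public class FindClassOrigin {
        public static void main(String[] args) {
            // On Java 8 a bootstrap class resolves to a rt.jar URL, e.g.
            // jar:file:/.../jre/lib/rt.jar!/javax/xml/bind/DatatypeConverterImpl.class
            String resource = "javax/xml/bind/DatatypeConverterImpl.class";
            System.out.println(ClassLoader.getSystemResource(resource));
        }
    }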