Hi Natu,

Something you could try is removing the packaged parquet format and
defining a custom format [1]. For this custom format you can then fix
the dependencies by packaging all of the following into the format:

* flink-sql-parquet
* flink-shaded-hadoop-2-uber
* hadoop-aws
* aws-java-sdk-bundle
* guava

This isn't entirely straightforward, unfortunately, and I haven't
verified it (a rough sketch of the packaging follows below). However,
with Ververica Platform 2.6, to be released shortly after Flink 1.15,
it should also work again.
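For illustration only, the Maven side of that packaging could look
roughly like the following. This is an untested sketch: the artifact
versions are placeholders that you would need to align with your
Flink 1.13.1 / Hadoop setup, and the Scala suffix may differ.

<!-- Untested sketch: bundle the parquet format together with its
     Hadoop/S3 dependencies into one fat JAR that can be registered
     as a custom format. All versions are placeholders. -->
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-sql-parquet_2.12</artifactId>
    <version>1.13.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop-2-uber</artifactId>
    <version>2.8.3-10.0</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>2.8.3</version>
  </dependency>
  <dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-bundle</artifactId>
    <version>1.11.271</version>
  </dependency>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>29.0-jre</version>
  </dependency>
</dependencies>

<build>
  <plugins>
    <!-- Shade all of the above into a single artifact. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>

The resulting JAR would then be uploaded as the custom format artifact
as described in [1].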
[1] https://docs.ververica.com/user_guide/sql_development/connectors.html#custom-connectors-and-formats

Best
Ingo

On Tue, Dec 7, 2021 at 6:23 AM Natu Lauchande <nlaucha...@gmail.com> wrote:

> Hey Timo and Flink community,
>
> I wonder if there is a fix for this issue. Last time I rolled back to
> Flink 1.12 and downgraded Ververica.
>
> I am really keen to leverage the new features in the latest versions of
> Ververica (2.5+). I have tried a myriad of the suggested tricks (for
> example, building the image with hadoop-client libraries):
>
> java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
>     at java.lang.Class.getDeclaredConstructors0(Native Method)
>     at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671)
>     at java.lang.Class.getDeclaredConstructors(Class.java:2020)
>     at java.io.ObjectStreamClass.computeDefaultSUID(ObjectStreamClass.java:1961)
>     at java.io.ObjectStreamClass.access$100(ObjectStreamClass.java:79)
>     at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:275)
>     at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:273)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.io.ObjectStreamClass.getSerialVersionUID(ObjectStreamClass.java:272)
>     at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:694)
>     at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2003)
>     at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1850)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2160)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2405)
>     at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2329)
>     at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2187)
>     at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1667)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:503)
>     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:615)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:600)
>     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:587)
>     at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:541)
>     at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:322)
>     at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:159)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:551)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
>     at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:540)
>     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:418)
>     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:64)
>     at org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:65)
>     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:48)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:351)
>     ... 48 more
>
> This error occurs when writing to S3 in Parquet format via the
> streaming file sink.
>
> Thanks,
> Natu
>
> On Thu, Jul 22, 2021 at 3:53 PM Timo Walther <twal...@apache.org> wrote:
>
>> Thanks, this should definitely work with the pre-packaged connectors of
>> Ververica Platform.
>>
>> I guess we have to investigate what is going on. Until then, a
>> workaround could be to add Hadoop manually and set the HADOOP_CLASSPATH
>> environment variable. The root cause seems to be that Hadoop cannot be
>> found.
>>
>> Alternatively, you could also build a custom image and include Hadoop in
>> the lib folder of Flink:
>>
>> https://docs.ververica.com/v1.3/platform/installation/custom_images.html
>>
>> I hope this helps. I will get back to you if we have a fix ready.
>>
>> Regards,
>> Timo
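As a rough illustration of the custom-image route Timo describes above,
a Dockerfile along these lines might do it. This is an untested sketch:
the base image tag, the flink-shaded-hadoop-2-uber version, and the
HADOOP_CLASSPATH path are placeholders you would need to adapt to your
platform version.

# Untested sketch: extend the matching Ververica Platform Flink image
# (tag is a placeholder) and put a Hadoop uber JAR into Flink's lib
# folder so it ends up on the classpath of every job.
FROM registry.ververica.com/v2.5/flink:1.13.1-stream1-scala_2.12

USER root
ADD https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop-2-uber/2.8.3-10.0/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar \
    /opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
# ADD from a URL creates root-only files; make the JAR readable for the
# user the Flink process runs as.
RUN chmod 644 /opt/flink/lib/flink-shaded-hadoop-2-uber-2.8.3-10.0.jar
USER flink

# Alternatively, bake a full Hadoop distribution into the image and
# point HADOOP_CLASSPATH at it, e.g. (placeholder paths):
# ENV HADOOP_CLASSPATH=/opt/hadoop/etc/hadoop:/opt/hadoop/share/hadoop/common/*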
>>
>> On 22.07.21 14:30, Natu Lauchande wrote:
>> > Sure.
>> >
>> > This is what the table DDL looks like:
>> >
>> > CREATE TABLE tablea (
>> >   `a` BIGINT,
>> >   `b` BIGINT,
>> >   `c` BIGINT
>> > )
>> > COMMENT ''
>> > WITH (
>> >   'auto-compaction' = 'false',
>> >   'connector' = 'filesystem',
>> >   'format' = 'parquet',
>> >   'parquet.block.size' = '134217728',
>> >   'parquet.compression' = 'SNAPPY',
>> >   'parquet.dictionary.page.size' = '1048576',
>> >   'parquet.enable.dictionary' = 'true',
>> >   'parquet.page.size' = '1048576',
>> >   'parquet.writer.max-padding' = '2097152',
>> >   'path' = 's3a://test/test',
>> >   'sink.partition-commit.delay' = '1 h',
>> >   'sink.partition-commit.policy.kind' = 'success-file',
>> >   'sink.partition-commit.success-file.name' = '_SUCCESS',
>> >   'sink.partition-commit.trigger' = 'process-time',
>> >   'sink.rolling-policy.check-interval' = '20 min',
>> >   'sink.rolling-policy.file-size' = '128MB',
>> >   'sink.rolling-policy.rollover-interval' = '2 h'
>> > );
>> >
>> > When I change the connector to a blackhole it immediately works without
>> > errors. I have redacted the names and paths.
>> >
>> > Thanks,
>> > Natu
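Side note: that blackhole check is a good way to confirm the failure
sits in the Parquet/Hadoop dependencies rather than in the query
itself, i.e. keeping the schema and swapping only the connector (the
table name below is made up):

CREATE TABLE tablea_blackhole (
  `a` BIGINT,
  `b` BIGINT,
  `c` BIGINT
) WITH (
  -- no Parquet/S3/Hadoop code paths are exercised by this sink
  'connector' = 'blackhole'
);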
>> >
>> > On Thu, Jul 22, 2021 at 2:24 PM Timo Walther <twal...@apache.org> wrote:
>> >
>> > Maybe you can also share which connector/format you are using? What is
>> > the DDL?
>> >
>> > Regards,
>> > Timo
>> >
>> > On 22.07.21 14:11, Natu Lauchande wrote:
>> > > Hey Timo,
>> > >
>> > > Thanks for the reply.
>> > >
>> > > No custom JAR file, as we are using Flink SQL and submitting the job
>> > > directly through the SQL editor UI. We are using Flink 1.13.1 as the
>> > > supported Flink version. No custom code; everything goes through
>> > > Flink SQL in the UI, no JARs.
>> > >
>> > > Thanks,
>> > > Natu
>> > >
>> > > On Thu, Jul 22, 2021 at 2:08 PM Timo Walther <twal...@apache.org> wrote:
>> > >
>> > > Hi Natu,
>> > >
>> > > Ververica Platform 2.5 has updated the bundled Hadoop version, but this
>> > > should not result in a NoClassDefFoundError exception. How are you
>> > > submitting your SQL jobs? You don't use Ververica's SQL service but
>> > > have built a regular JAR file, right? If this is the case, can you
>> > > share your pom.xml file with us? The Flink version stays constant at
>> > > 1.12?
>> > >
>> > > Regards,
>> > > Timo
>> > >
>> > > On 22.07.21 12:22, Natu Lauchande wrote:
>> > > > Good day Flink community,
>> > > >
>> > > > Apache Flink/Ververica Community Edition - Question
>> > > >
>> > > > I am having an issue with my Flink SQL jobs since updating from
>> > > > Flink 1.12/Ververica 2.4 to Ververica 2.5. For all the jobs running
>> > > > on Parquet and S3 I am getting the following error continuously:
>> > > >
>> > > > INITIALIZING to FAILED on 10.243.3.0:42337-2a3224 @
>> > > > 10-243-3-0.flink-metrics.vvp-jobs.svc.cluster.local (dataPort=39309).
>> > > >
>> > > > java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
>> > > >     at java.lang.Class.getDeclaredConstructors0(Native Method) ~[?:1.8.0_292]
>> > > >     at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) ~[?:1.8.0_292]
>> > > >     at java.lang.Class.getDeclaredConstructors(Class.java:2020) ~[?:1.8.0_292]
>> > > >     ....
>> > > >     at java.io.ObjectInputStream.readObject(ObjectInputStream.java:461) ~[?:1.8.0_292]
>> > > >     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:615) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:600) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:587) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.util.InstantiationUtil.readObjectFromConfig(InstantiationUtil.java:541) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:322) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperator(OperatorChain.java:653) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:626) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.OperatorChain.<init>(OperatorChain.java:181) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:548) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_292]
>> > > > Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
>> > > >     at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[?:1.8.0_292]
>> > > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:418) ~[?:1.8.0_292]
>> > > >     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClassWithoutExceptionHandling(FlinkUserCodeClassLoader.java:64) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.util.ChildFirstClassLoader.loadClassWithoutExceptionHandling(ChildFirstClassLoader.java:65) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at org.apache.flink.util.FlinkUserCodeClassLoader.loadClass(FlinkUserCodeClassLoader.java:48) ~[flink-dist_2.12-1.13.1-stream1.jar:1.13.1-stream1]
>> > > >     at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ~[?:1.8.0_292]
>> > > >     ... 57 more
>> > > >
>> > > > 2021-07-22 09:38:43,095 DEBUG org.apache.flink.runtime.scheduler.SharedSlot [] - Remove logical slot (SlotRequestId{4297879e795d0516e36a7c26ccc795b2}) for execution vertex (id cbc357ccb763df2852fee8c4fc7d55f2_0) from the physical slot (SlotRequestId{df7c49a6610b56f26aea214c05bcd9ed})
>> > > >
>> > > > 2021-07-22 09:38:43,096 DEBUG org.apache.flink.runtime.scheduler.SharedSlot [] - Release shared slot externally (SlotRequestId{df7c49a6610b56f26aea214c05bcd9ed})
>> > > >
>> > > > Everything works well when I roll back to Ververica 2.4. Has anyone
>> > > > experienced this error before?
>> > > >
>> > > > Thanks,
>> > > >
>> > > > Natu