Hi, I'm running this on a Hadoop-free cluster (i.e., no YARN, etc.). I have the following dependencies packaged in my user application JAR:
aws-java-sdk 1.7.4
flink-hadoop-fs 1.4.0
flink-shaded-hadoop2 1.4.0
flink-connector-filesystem_2.11 1.4.0
hadoop-common 2.7.4
hadoop-aws 2.7.4

I have also tried the following conf:

classloader.resolve-order: parent-first
fs.hdfs.hadoopconf: /srv/hadoop/hadoop-2.7.5/etc/hadoop

But no luck. Anything else I could be missing?

On 2018/03/14 18:57:47, Francesco Ciuci <francesco.ci...@gmail.com> wrote:
> Hi,
>
> You do not just need the Hadoop dependencies in the jar; you also need to
> have the Hadoop file system available on your machine/cluster.
>
> Regards
>
> On 14 March 2018 at 18:38, l...@lyft.com <l...@lyft.com> wrote:
> >
> > I'm trying to use a BucketingSink to write files to S3 in my Flink job.
> >
> > I have the Hadoop dependencies I need packaged in my user application jar.
> > However, on running the job I get the following error (from the
> > taskmanager):
> >
> > java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
> >     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:358)
> >     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
> >     at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
> >     at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
> >     at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
> >     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
> >     at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
> >     at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
> >     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
> >     at java.lang.Thread.run(Thread.java:748)
> > Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find a file system implementation for scheme 's3a'. The scheme is not directly supported by Flink and no Hadoop file system to support this scheme could be loaded.
> >     at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
> >     at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
> >     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1125)
> >     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
> >     at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
> >     ... 9 common frames omitted
> > Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Hadoop is not in the classpath/dependencies.
> >     at org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:64)
> >     at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
> >     ... 13 common frames omitted
> >
> > What's the right way to do this?
> >
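For anyone hitting the same "Hadoop is not in the classpath/dependencies" error on a Hadoop-free setup: as I understand it, Flink initializes file systems from its own classpath, before the user-code jar is consulted, which is why packaging the Hadoop/S3 dependencies inside the application JAR does not help. A minimal sketch of the approach the Flink 1.4 docs describe for Hadoop-free S3 access (the jar name, bucket, and credential placeholders below are assumptions; verify the exact file and config keys against your distribution's documentation):

```yaml
# Sketch, not a verified setup. Prerequisite on EVERY node (JobManager and
# TaskManagers): copy the bundled shaded S3 filesystem out of opt/ into lib/,
# e.g.  cp opt/flink-s3-fs-hadoop-1.4.0.jar lib/
# Then add credentials to flink-conf.yaml (placeholder values shown):
s3.access-key: YOUR_ACCESS_KEY      # assumption: static keys; IAM roles avoid this
s3.secret-key: YOUR_SECRET_KEY
```

With that jar in lib/, paths like s3://my-bucket/out should resolve without any Hadoop installation. Note that the shaded filesystem registers the s3:// scheme; I don't believe s3a:// is covered by it in 1.4, so a sink URI using s3a would either need to be changed to s3://, or a full Hadoop with hadoop-aws would have to be put on Flink's classpath (not the user jar) and fs.s3a.* configured in the Hadoop config pointed to by fs.hdfs.hadoopconf.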