Shouldn't the value for that config key be org.apache.hadoop.fs.s3.S3FileSystem?
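Whichever implementation class ends up in the value, note that the key itself is derived from the URI scheme: Hadoop looks up "fs.<scheme>.impl" when it resolves a path. A stdlib-only sketch of that lookup, just to illustrate the mechanism (the map entry mirrors the core-site.xml in your mail; SchemeLookupSketch is an illustrative name, the real resolution happens inside Hadoop's FileSystem class):

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

public class SchemeLookupSketch {
    public static void main(String[] args) {
        // Hadoop derives the config key "fs.<scheme>.impl" from the URI
        // scheme. This stdlib-only sketch mimics that lookup; the entry
        // below is the one from the core-site.xml in the original mail.
        Map<String, String> conf = new HashMap<>();
        conf.put("fs.s3.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");

        String scheme = URI.create("s3://test-flink/test.txt").getScheme();
        String key = "fs." + scheme + ".impl";
        System.out.println(key + " -> " + conf.getOrDefault(key, "<none>"));
    }
}
```

So a URI starting with s3:// always resolves through fs.s3.impl, regardless of which class that key points at.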
Cheers

On Fri, Aug 11, 2017 at 5:38 PM, ant burton <apburto...@gmail.com> wrote:

> Hello,
>
> After following the instructions to set the S3 filesystem in the
> documentation (https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem)
> I encountered the following error:
>
>     No file system found with scheme s3, referenced in file URI
>     's3://<bucket>/<endpoint>'.
>
> The documentation goes on to say: "If your job submission fails with an
> Exception message noting that No file system found with scheme s3, this
> means that no FileSystem has been configured for S3. Please check out the
> FileSystem Configuration section
> <https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/aws.html#set-s3-filesystem>
> for details on how to configure this properly."
>
> After checking over the configuration, the error persisted. My
> configuration is as follows.
>
> I am using the Docker image flink:1.3.1, with command: local
>
>     # flink --version
>     Version: 1.3.1, Commit ID: 1ca6e5b
>
>     # cat flink/config/flink-conf.yaml | head -n1
>     fs.hdfs.hadoopconf: /root/hadoop-config
>
> The rest of the content of flink-conf.yaml is identical to the release
> version.
>
> The following was added to /root/hadoop-config/core-site.xml.
> I understand this is used internally by Flink as the configuration
> for org.apache.hadoop.fs.s3a.S3AFileSystem.
>
> I've removed my AWS access key and secret for obvious reasons; they are
> present in the actual file ;-)
>
>     # cat /root/hadoop-config/core-site.xml
>     <configuration>
>         <property>
>             <name>fs.s3.impl</name>
>             <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
>         </property>
>
>         <property>
>             <name>fs.s3a.buffer.dir</name>
>             <value>/tmp</value>
>         </property>
>
>         <property>
>             <name>fs.s3a.access.key</name>
>             <value>MY_ACCESS_KEY</value>
>         </property>
>
>         <property>
>             <name>fs.s3a.secret.key</name>
>             <value>MY_SECRET_KEY</value>
>         </property>
>     </configuration>
>
> The JARs aws-java-sdk-1.7.4.jar, hadoop-aws-2.7.4.jar,
> httpclient-4.2.5.jar, and httpcore-4.2.5.jar were added to flink/lib/
> from
> http://apache.mirror.anlx.net/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
>
>     # ls flink/lib/
>     aws-java-sdk-1.7.4.jar
>     flink-dist_2.11-1.3.1.jar
>     flink-python_2.11-1.3.1.jar
>     flink-shaded-hadoop2-uber-1.3.1.jar
>     hadoop-aws-2.7.4.jar
>     httpclient-4.2.5.jar
>     httpcore-4.2.5.jar
>     log4j-1.2.17.jar
>     slf4j-log4j12-1.7.7.jar
>
> I'm using the streaming API, with the following example:
>
>     // Set up the StreamExecutionEnvironment
>     final StreamExecutionEnvironment env =
>         StreamExecutionEnvironment.getExecutionEnvironment();
>
>     // Use event time as the stream time characteristic
>     env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
>
>     // Add source (input stream)
>     DataStream<String> dataStream = StreamUtil.getDataStream(env, params);
>
>     // Sink to S3 bucket
>     dataStream.writeAsText("s3://test-flink/test.txt").setParallelism(1);
>
> pom.xml has the following build dependencies.
>
>     <dependency>
>         <groupId>org.apache.flink</groupId>
>         <artifactId>flink-connector-filesystem_2.10</artifactId>
>         <version>1.3.1</version>
>     </dependency>
>
>     <dependency>
>         <groupId>org.apache.hadoop</groupId>
>         <artifactId>hadoop-aws</artifactId>
>         <version>2.7.2</version>
>     </dependency>
>
> Would anybody be able to spare some time to help me resolve my problem?
> I'm sure I'm missing something simple here.
>
> Thanks :-)
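Since the jars and the Hadoop config both look plausible on paper, one thing worth ruling out is that the class named in fs.s3.impl is actually loadable from flink/lib at runtime. A standalone sketch, assuming you run it with flink/lib/*.jar on the classpath (class names taken from the jars listed above; ClasspathCheck is just an illustrative name, and this only tests the JVM it runs in, not Flink's own classloader):

```java
public class ClasspathCheck {
    public static void main(String[] args) {
        // Class names from the mail above. This verifies only that each
        // class is loadable on *this* JVM's classpath; Flink's classloader
        // setup may still differ.
        String[] classes = {
            "org.apache.hadoop.fs.s3a.S3AFileSystem",
            "com.amazonaws.services.s3.AmazonS3Client"
        };
        for (String name : classes) {
            try {
                Class.forName(name);
                System.out.println("OK      " + name);
            } catch (ClassNotFoundException e) {
                System.out.println("MISSING " + name);
            }
        }
    }
}
```

If either line comes back MISSING with the lib directory on the classpath, the jar set is incomplete or mismatched before Flink configuration even enters the picture.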