Hi Brian, thanks, that helped me a lot.
2015-12-15 16:52 GMT+01:00 Brian Chhun <brian.ch...@getbraintree.com>: > Sure, excuse me if anything was obvious or wrong, I know next to nothing > about Hadoop. > > 1. get the Hadoop 2.7 distribution (I set its path to HADOOP_HOME to make > things easier for mysellf) > 2. set the HADOOP_CLASSPATH to include > ${HADOOP_HOME}/share/hadoop/common/*:${HADOOP_HOME}/share/hadoop/tools/lib/* > (you may not need all those paths?) > 3. stick this into $HADOOP_HOME/etc/hadoop/core-site.xml > > <configuration> > <property> > <name>fs.defaultFS</name> > <value>s3a://YOUR-BUCKET</value> > </property> > <property> > <name>fs.s3a.impl</name> > <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value> > </property> > </configuration> > > 4. stick this into your flink-conf > > fs.hdfs.hadoopconf: $HADOOP_HOME/etc/hadoop > recovery.mode: zookeeper > recovery.zookeeper.quorum: whatever01.local:2181 > recovery.zookeeper.path.root: /whatever > state.backend: filesystem > state.backend.fs.checkpointdir: s3a:///YOUR-BUCKET/checkpoints > recovery.zookeeper.storageDir: s3a:///YOUR-BUCKET/recovery > > That's all I had to do in the Flink side. obvs in the AWS side, I had my > IAM role setup with readlwrite access to the bucket. > > Thanks, > Brian > > On Mon, Dec 14, 2015 at 10:39 PM, Thomas Götzinger <m...@simplydevelop.de> > wrote: > >> Hi Brian >> >> Can you give me short summary how to achieve this. >> Am 14.12.2015 23:20 schrieb "Brian Chhun" <brian.ch...@getbraintree.com>: >> >>> For anyone else looking, I was able to use the s3a filesystem which can >>> use IAM role based authentication as provided by the underlying AWS client >>> library. >>> >>> Thanks, >>> Brian >>> >>> On Thu, Dec 10, 2015 at 4:28 PM, Brian Chhun < >>> brian.ch...@getbraintree.com> wrote: >>> >>>> Thanks Ufuk, this did the trick. >>>> >>>> Thanks, >>>> Brian >>>> >>>> On Wed, Dec 9, 2015 at 4:37 PM, Ufuk Celebi <u...@apache.org> wrote: >>>> >>>>> Hey Brian, >>>>> >>>>> did you follow the S3 setup guide? >>>>> https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html >>>>> >>>>> You have to set the fs.hdfs.hadoopconf property and add >>>>> >>>>> <property> >>>>> <name>fs.s3.impl</name> >>>>> <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value> >>>>> </property> >>>>> >>>>> to core-site.xml >>>>> >>>>> – Ufuk >>>>> >>>>> > On 09 Dec 2015, at 20:50, Brian Chhun <brian.ch...@getbraintree.com> >>>>> wrote: >>>>> > >>>>> > Hello, >>>>> > >>>>> > I'm trying to setup an HA cluster and I'm running into issues using >>>>> S3 as the state backend. This is raised during startup: >>>>> > >>>>> > 2015-12-09T19:23:36.430724+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: java.io.IOException: No file system >>>>> found with scheme s3, referenced in file URI 's3:///flink/recovery/blob'. >>>>> > >>>>> > 2015-12-09T19:23:36.430858+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.core.fs.FileSystem.get(FileSystem.java:242) >>>>> > >>>>> > 2015-12-09T19:23:36.430989+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.blob.FileSystemBlobStore.<init>(FileSystemBlobStore.java:67) >>>>> > >>>>> > 2015-12-09T19:23:36.431297+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.blob.BlobServer.<init>(BlobServer.java:105) >>>>> > >>>>> > 2015-12-09T19:23:36.431435+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.createJobManagerComponents(JobManager.scala:1814) >>>>> > >>>>> > 2015-12-09T19:23:36.431569+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.startJobManagerActors(JobManager.scala:1944) >>>>> > >>>>> > 2015-12-09T19:23:36.431690+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.startJobManagerActors(JobManager.scala:1898) >>>>> > >>>>> > 2015-12-09T19:23:36.431810+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.startActorSystemAndJobManagerActors(JobManager.scala:1584) >>>>> > >>>>> > 2015-12-09T19:23:36.431933+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.runJobManager(JobManager.scala:1486) >>>>> > >>>>> > 2015-12-09T19:23:36.432414+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager$.main(JobManager.scala:1447) >>>>> > >>>>> > 2015-12-09T19:23:36.432649+00:00 i-1ec317c4 >>>>> docker/jobmanager01-d3174d6[1207]: #011at >>>>> org.apache.flink.runtime.jobmanager.JobManager.main(JobManager.scala) >>>>> > >>>>> > Is it possible to use S3 as the backend store or is only hdfs/mapfs >>>>> supported? >>>>> > >>>>> > >>>>> > Thanks, >>>>> > Brian >>>>> >>>>> >>>> >>> > -- Viele Grüße Thomas Götzinger Freiberuflicher Informatiker Glockenstraße 2a D-66882 Hütschenhausen OT Spesbach Mobil: +49 (0)176 82180714 Homezone: +49 (0) 6371 735083 Privat: +49 (0) 6371 954050 mailto:m...@simplydevelop.de <thomas.goetzin...@kajukin.de> epost: thomas.goetzin...@epost.de