Hi Lorenzo,

Since Flink 1.8 does not support the plugin mechanism for loading
filesystems, you need to copy flink-s3-fs-hadoop-*.jar from the opt
directory to the lib directory.

The Dockerfile could look like the following.

FROM flink:1.8-scala_2.11
# Copy the bundled S3 Hadoop filesystem jar from opt onto the lib classpath
RUN cp /opt/flink/opt/flink-s3-fs-hadoop-*.jar /opt/flink/lib

Then build your Docker image and start the session cluster again.
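
For example, a minimal sketch (the image tag flink-1.8-s3 and the network
name flink-net below are just placeholder names I chose, not required):

# Build the custom image from the Dockerfile above
docker build -t flink-1.8-s3 .

# Start a session cluster: one JobManager and one TaskManager
docker network create flink-net
docker run -d --name jobmanager --network flink-net -p 8081:8081 \
  -e JOB_MANAGER_RPC_ADDRESS=jobmanager flink-1.8-s3 jobmanager
docker run -d --name taskmanager --network flink-net \
  -e JOB_MANAGER_RPC_ADDRESS=jobmanager flink-1.8-s3 taskmanager

# Verify the S3 filesystem jar is now on the classpath
docker run --rm flink-1.8-s3 ls /opt/flink/lib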


Best,
Yang


Lorenzo Nicora <lorenzo.nic...@gmail.com> wrote on Thu, 2 Jul 2020 at 18:05:

> Hi
>
> I need to set up a dockerized *session cluster* using Flink *1.8.2* for
> development and troubleshooting. We are bound to 1.8.2 as we are deploying
> to AWS Kinesis Data Analytics for Flink.
>
> I am using an image based on the semi-official flink:1.8-scala_2.11 image.
> I need to add support for the S3 Hadoop File System (s3a://) to my
> dockerized cluster, which we have on KDA out of the box.
>
> Note that I do not want to add dependencies to the job directly, as I want
> to deploy locally exactly the same JAR I deploy to KDA.
>
> The Flink 1.8 docs [1] say S3 is supported out of the box, but that does
> not seem to be the case for the dockerised version.
> I am getting "Could not find a file system implementation for scheme
> 's3a'" and "Hadoop is not in the classpath/dependencies".
> I assume I need to create a customised docker image,
> extending flink:1.8-scala_2.11, but I do not understand how to add support
> for S3 Hadoop FS.
>
> Can someone please point me in the right direction? Docs or examples?
>
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/filesystems.html
>
>
> Lorenzo
>
