Re: Struggling with reading the file from s3 as Source

2020-09-14 Thread Vijay Balakrishnan
My problem was that the plugin jar needs to be under plugins/s3-fs-hadoop. The code ran after I added s3.access-key: and s3.secret-key: to flink-conf.yaml and removed all Hadoop dependencies from pom.xml, then started the cluster from the Flink directory with ./bin/start-cluster.sh and submitted with ./bin/flink run xyz..jar. Still struggling with how to get it to work with pom.xml.
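
A minimal sketch of that setup, assuming the stock Flink 1.11.1 distribution layout; the credential values are placeholders, not real keys:

  # plugin jar in its own subdirectory under plugins/ (not on the job classpath)
  flink-1.11.1/
    plugins/
      s3-fs-hadoop/
        flink-s3-fs-hadoop-1.11.1.jar

  # conf/flink-conf.yaml
  s3.access-key: <placeholder-access-key>
  s3.secret-key: <placeholder-secret-key>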

Re: Struggling with reading the file from s3 as Source

2020-09-14 Thread Vijay Balakrishnan
Hi Robert, Thanks for the link. Is there a simple example I can use as a starting template for using S3 with pom.xml? I copied flink-s3-fs-hadoop-1.11.1.jar into the plugins/s3-fs-hadoop directory. Running from flink-1.11.1/: flink run -cp ../target/monitoring-rules-influx-1.0.jar -jar /Users/v
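
One possible sketch of the pom.xml side, assuming Flink 1.11.1 with Scala 2.11 artifacts: the S3 filesystem stays out of the pom entirely (it is loaded from plugins/ at runtime), and only the job's own dependencies such as the Kinesis connector are declared. Artifact coordinates below are an assumption, not taken from the original post:

  <!-- sketch only: flink-s3-fs-hadoop is a runtime plugin and is NOT declared here -->
  <dependencies>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-java_2.11</artifactId>
      <version>1.11.1</version>
      <scope>provided</scope>  <!-- provided by the Flink cluster at runtime -->
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kinesis_2.11</artifactId>
      <version>1.11.1</version>
    </dependency>
  </dependencies>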

Re: Struggling with reading the file from s3 as Source

2020-09-11 Thread Robert Metzger
Hi Vijay, Can you post the error you are referring to? Did you properly set up an s3 plugin (https://ci.apache.org/projects/flink/flink-docs-stable/ops/filesystems/)? On Fri, Sep 11, 2020 at 8:42 AM Vijay Balakrishnan wrote: > Hi, > > I want to *get data from S3 and process and send to Kinesis
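
For reference, the plugin setup that page describes boils down to something like the following, assuming the standard flink-1.11.1 distribution where the optional filesystem jars ship in opt/:

  cd flink-1.11.1
  mkdir -p plugins/s3-fs-hadoop
  cp opt/flink-s3-fs-hadoop-1.11.1.jar plugins/s3-fs-hadoop/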

Struggling with reading the file from s3 as Source

2020-09-10 Thread Vijay Balakrishnan
Hi, I want to *get data from S3, process it, and send it to Kinesis.*
1. Get gzip files from an s3 folder (s3://bucket/prefix)
2. Sort each file
3. Do some map/processing on each record in the file
4. Send to Kinesis
The idea is: env.readTextFile(s3Folder) .sort(SortFunction) .map(MapFunction) .sink(Kines
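
A rough sketch of that pipeline in the DataStream API, assuming the s3-fs-hadoop plugin and the Kinesis connector are in place; class, region, and stream names are placeholders, and the per-file sort step is only indicated with a comment because the plain DataStream API has no sort operator:

  // Sketch only: names, region, and stream are placeholders, not from the original post.
  import java.util.Properties;

  import org.apache.flink.api.common.serialization.SimpleStringSchema;
  import org.apache.flink.streaming.api.datastream.DataStream;
  import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
  import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisProducer;
  import org.apache.flink.streaming.connectors.kinesis.config.AWSConfigConstants;

  public class S3ToKinesisJob {
      public static void main(String[] args) throws Exception {
          StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

          // s3:// paths resolve only with the s3-fs-hadoop plugin installed;
          // .gz files are decompressed by extension by Flink's file input format.
          DataStream<String> lines = env.readTextFile("s3://bucket/prefix");

          // Per-file sorting would need a keyed window or a custom operator; omitted here.
          DataStream<String> processed = lines
                  .map(line -> line.trim());  // placeholder for the real MapFunction

          Properties producerConfig = new Properties();
          producerConfig.put(AWSConfigConstants.AWS_REGION, "us-west-2");  // placeholder region

          FlinkKinesisProducer<String> kinesisSink =
                  new FlinkKinesisProducer<>(new SimpleStringSchema(), producerConfig);
          kinesisSink.setDefaultStream("output-stream");  // placeholder stream name
          kinesisSink.setDefaultPartition("0");

          processed.addSink(kinesisSink);
          env.execute("s3-to-kinesis");
      }
  }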