Re: StreamingFileSink on EMR

2019-02-26 Thread kb
Thanks! This fixed it. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: StreamingFileSink on EMR

2019-02-26 Thread Bruno Aranda
Hi, That Jar must exist for all the 1.7 versions, but I was replacing the libs for the Flink provided by the AWS EMR (1.7.0) by the more recent ones. But you could download the 1.7.0 distribution and copy the flink-s3-fs-hadoop-1.7.0.jar from there into the /usr/lib/flink/lib folder. But knowing

Re: StreamingFileSink on EMR

2019-02-26 Thread kb
Hi, So 1.7.2 jar has the fix? Thanks Kevin

Re: StreamingFileSink on EMR

2019-02-26 Thread Bruno Aranda
Hey, Got it working, basically you need to add the flink-s3-fs-hadoop-1.7.2.jar libraries from the /opt folder of the flink distribution into the /usr/lib/flink/lib. That has done the trick for me. Cheers, Bruno On Tue, 26 Feb 2019 at 16:28, kb wrote: > Hi Bruno, > > Thanks for verifying. We
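
For readers following along, the copy step Bruno describes could be scripted roughly like this. It is only a sketch: the function name `copy_s3_fs_jar` and the location of the unpacked Flink distribution are assumptions for illustration, while the jar name, the `opt` folder and `/usr/lib/flink/lib` come from the message above.

```python
import shutil
from pathlib import Path


def copy_s3_fs_jar(flink_dist: Path, flink_lib: Path, version: str = "1.7.2") -> Path:
    """Copy flink-s3-fs-hadoop-<version>.jar from <flink_dist>/opt into flink_lib.

    flink_dist is the root of an unpacked Flink distribution (hypothetical
    location); flink_lib is the lib folder of the installed Flink, which the
    thread gives as /usr/lib/flink/lib on EMR.
    """
    jar_name = f"flink-s3-fs-hadoop-{version}.jar"
    source = flink_dist / "opt" / jar_name
    if not source.is_file():
        raise FileNotFoundError(f"{source} not found in the distribution's opt folder")
    flink_lib.mkdir(parents=True, exist_ok=True)
    target = flink_lib / jar_name
    shutil.copy2(source, target)  # copies file contents and metadata
    return target
```

On a cluster this might be called as `copy_s3_fs_jar(Path("/path/to/flink-1.7.2"), Path("/usr/lib/flink/lib"))`, where the first path is a placeholder. Writing to `/usr/lib/flink/lib` will probably need elevated permissions, and Flink most likely has to be restarted before it sees the new jar.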

Re: StreamingFileSink on EMR

2019-02-26 Thread kb
Hi Bruno, Thanks for verifying. We are aiming for the same. Best, Kevin

Re: StreamingFileSink on EMR

2019-02-26 Thread Bruno Aranda
Hi, I am having the same issue, but it is related to what Kostas is pointing out. I was trying to stream to the "s3" scheme and not "hdfs", and then getting that exception. I have realised that somehow I need to reach the S3RecoverableWriter, and found out it is in a different library "flink-s3-

Re: StreamingFileSink on EMR

2019-02-26 Thread Kostas Kloudas
Hi Kevin, I cannot find anything obviously wrong from what you describe. Just to eliminate the obvious, you are specifying "hdfs" as the scheme for your file path, right? Cheers, Kostas On Tue, Feb 26, 2019 at 3:35 PM Till Rohrmann wrote: > Hmm good question, I've pulled in Kostas who worked o

Re: StreamingFileSink on EMR

2019-02-26 Thread Till Rohrmann
Hmm good question, I've pulled in Kostas who worked on the StreamingFileSink. He might be able to tell you more in case that there is some special behaviour wrt the Hadoop file systems. Cheers, Till On Tue, Feb 26, 2019 at 3:29 PM kb wrote: > Hi Till, > > The only potential issue in the path I

Re: StreamingFileSink on EMR

2019-02-26 Thread kb
Hi Till, The only potential issue in the path I see is `/usr/share/aws/emr/emrfs/lib/emrfs-hadoop-assembly-2.29.0.jar`. I double checked my pom, the project is Hadoop-free. The JM log also shows `INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Hadoop version: 2.8.5-amzn-1`.
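
A check like the one Kevin did by eye can be sketched in a few lines. This assumes the JobManager log contains a `Hadoop version:` line in the form quoted above, and takes 2.7 as the minimum because that is the version named in the exception from the original post. The helper names are made up for illustration.

```python
import re
from typing import Optional, Tuple

# Matches the "Hadoop version: 2.8.5-amzn-1" fragment quoted in the message above.
HADOOP_VERSION_RE = re.compile(r"Hadoop version:\s*(\d+)\.(\d+)")


def hadoop_major_minor(log_text: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) from the first 'Hadoop version:' entry, or None."""
    match = HADOOP_VERSION_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: Tuple[int, int], minimum: Tuple[int, int] = (2, 7)) -> bool:
    """Compare (major, minor) tuples; 2.7 is the minimum named in the exception."""
    return version >= minimum
```

For the log line quoted above this yields `(2, 8)`, which passes the 2.7 check. That is consistent with the rest of the thread, where the cause turns out to be the file system scheme and a missing jar, not the Hadoop version.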

Re: StreamingFileSink on EMR

2019-02-26 Thread Till Rohrmann
Hi Kevin, could you check what's on the class path of the Flink cluster? You should see this in the jobmanager.log at the top. It seems as if there is a Hadoop dependency with a lower version. Flink 1.7 is built against which Hadoop version? You should make sure that you either use the Hadoop-free

StreamingFileSink on EMR

2019-02-25 Thread Bohinski, Kevin (Contractor)
When running Flink 1.7 on EMR 5.21 using StreamingFileSink we see java.lang.UnsupportedOperationException: Recoverable writers on Hadoop are only supported for HDFS and for Hadoop version 2.7 or newer. EMR is showing Hadoop version 2.8.5. Is anyone else seeing this issue?