How were the Parquet files you are trying to read generated? Same
version of libraries? I am successfully using the following Scala code
to read Parquet files using the HadoopInputFormat wrapper. Maybe try
that in Java?
// (arguments after "new" reconstructed here -- this assumes Avro-written Parquet read back
//  as GenericRecords through AvroParquetInputFormat, with "job" a Hadoop mapreduce Job)
val hadoopInputFormat = new HadoopInputFormat[Void, GenericRecord](
  new AvroParquetInputFormat[GenericRecord], classOf[Void], classOf[GenericRecord], job)
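Spelled out end to end, the wiring I have in mind looks roughly like the sketch below. The AvroParquetInputFormat, the GenericRecord value type and the input path are assumptions on my side (they fit Parquet files written through parquet-avro); adjust them to however your files were actually written, and make sure flink-hadoop-compatibility and parquet-avro are on the classpath.

import org.apache.avro.generic.GenericRecord
import org.apache.flink.api.scala._
import org.apache.flink.api.scala.hadoop.mapreduce.HadoopInputFormat
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.parquet.avro.AvroParquetInputFormat

object ReadParquet {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val job = Job.getInstance()

    // Wrap the Hadoop mapreduce AvroParquetInputFormat in Flink's HadoopInputFormat wrapper.
    val hadoopInputFormat = new HadoopInputFormat[Void, GenericRecord](
      new AvroParquetInputFormat[GenericRecord], classOf[Void], classOf[GenericRecord], job)

    // Point the input path at the directory holding the Parquet files, not at a single file.
    FileInputFormat.addInputPath(job, new Path("hdfs:///path/to/parquet-dir"))

    // The key side is always null (Void) for Parquet; the records are the second tuple field.
    val parquetData: DataSet[(Void, GenericRecord)] = env.createInput(hadoopInputFormat)
    parquetData.map(_._2.toString).print()
  }
}

If that runs for you in Scala, translating it to Java should mostly be a matter of using org.apache.flink.api.java.hadoop.mapreduce.HadoopInputFormat and Tuple2&lt;Void, GenericRecord&gt; instead.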
data to s3 if I do batch processing, but not
stream processing. Do you know what the difference is and why it would work for
one and not the other?
Sam
On Wed, Jan 11, 2017 at 12:40 PM, M. Dale wrote:
Sam, Don't point the variables at files, point them at the directories
containing the
parameterTool.getProperties()));
messageStream.print();
messageStream.writeAsText("s3://flink-test/flinkoutputtest.txt").setParallelism(1);
env.execute();
On Tue, Jan 10, 2017 at 4:06 PM, M. Dale wrote:
Sam, I just happened to answer a similar question on Stack Overflow at "Does
Apache Flink AWS S3 Sink require Hadoop for local testing?". I also submitted a
PR to make that (for me) a little clearer in the Apache Flink documentation
(https://github.com/apache/flink/pull/3054/files).
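The short version of that setup, as a sketch (the config directory, bucket and credentials below are placeholders; double-check the exact steps against the linked answer and docs change): point Flink at a Hadoop configuration directory rather than at a single file, and declare the S3 filesystem there.

# flink-conf.yaml -- point at the directory that contains core-site.xml
fs.hdfs.hadoopconf: /path/to/hadoop/conf

<!-- /path/to/hadoop/conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>

On top of that, the hadoop-aws jar and a matching aws-java-sdk jar have to be on Flink's classpath (e.g. in the lib folder) so the s3a classes resolve; the same filesystem configuration is picked up by both the DataSet and the DataStream APIs.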
I cloned the Apache Flink source code from
https://github.com/apache/flink and
want to build the 1.2-SNAPSHOT with Scala 2.11.
git clone git@github.com:apache/flink.git
cd flink
git checkout remotes/origin/release-1.2
Following instructions from the 1.2 docs at
https://ci.apache.org/projects/f
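(For what it's worth, the build step those instructions come down to is the standard Maven install shown below; this is a minimal sketch that assumes Maven 3 and a JDK are installed, and it leaves the Scala 2.11 selection to whatever mechanism the 1.2 build page documents for that release line.)

# Standard Flink source build, skipping tests, per the "Building Flink from Source" docs.
# Run from the flink/ checkout after switching the Scala binary version to 2.11
# using the mechanism described on the 1.2 build page.
mvn clean install -DskipTests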