Hi Saliya,

You can use Hadoop input formats in Flink via the readHadoopFile method. The 
method returns a DataSet whose elements are of type Tuple2<Key, Value>. Note 
that the MapReduce-equivalent transformation in Flink is typically composed of 
map, groupBy, and reduceGroup.
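A rough sketch of what that looks like is below. The SequenceFile input, the 
IntWritable/ShortWritable key and value types, and the HDFS path are just 
assumptions for illustration; you would substitute whatever Hadoop InputFormat 
matches your binary file layout.

```java
import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.ShortWritable;
import org.apache.hadoop.mapred.SequenceFileInputFormat;

public class HadoopInputExample {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // readHadoopFile wraps a Hadoop InputFormat; each record arrives as a
    // Tuple2<Key, Value>. Key/value classes and path are illustrative.
    DataSet<Tuple2<IntWritable, ShortWritable>> input =
        env.readHadoopFile(
            new SequenceFileInputFormat<IntWritable, ShortWritable>(),
            IntWritable.class, ShortWritable.class,
            "hdfs:///path/to/matrix");

    // MapReduce-style processing: map each record, group by key, then
    // apply a group-wise reduce (here, summing the values per key).
    input
        .map(new MapFunction<Tuple2<IntWritable, ShortWritable>,
                             Tuple2<Integer, Long>>() {
          @Override
          public Tuple2<Integer, Long> map(
              Tuple2<IntWritable, ShortWritable> record) {
            return new Tuple2<>(record.f0.get(), (long) record.f1.get());
          }
        })
        .groupBy(0)
        .reduceGroup(new GroupReduceFunction<Tuple2<Integer, Long>,
                                             Tuple2<Integer, Long>>() {
          @Override
          public void reduce(Iterable<Tuple2<Integer, Long>> values,
                             Collector<Tuple2<Integer, Long>> out) {
            int key = 0;
            long sum = 0L;
            for (Tuple2<Integer, Long> v : values) {
              key = v.f0;
              sum += v.f1;
            }
            out.collect(new Tuple2<>(key, sum));
          }
        })
        .print();
  }
}
```

Note this needs the flink-java and hadoop-common dependencies on the classpath 
and a running environment to execute.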

> On Jan 20, 2016, at 3:04 PM, Suneel Marthi <smar...@apache.org> wrote:
> 
> Guess u r looking for Flink's BinaryInputFormat to be able to read blocks of 
> data from HDFS
> 
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/api/common/io/BinaryInputFormat.html
> 
> On Wed, Jan 20, 2016 at 12:45 AM, Saliya Ekanayake <esal...@gmail.com> wrote:
> Hi,
> 
> I am trying to use Flink perform a parallel batch operation on a NxN matrix 
> represented as a binary file. Each (i,j) element is stored as a Java Short 
> value. In a typical MapReduce programming with Hadoop, each map task will 
> read a block of rows of this matrix and perform computation on that block and 
> emit result to the reducer.
> 
> How is this done in Flink? I am new to Flink and couldn't find a binary 
> reader so far. Any help is greatly appreciated.
> 
> Thank you,
> Saliya
> 
> -- 
> Saliya Ekanayake
> Ph.D. Candidate | Research Assistant
> School of Informatics and Computing | Digital Science Center
> Indiana University, Bloomington
> Cell 812-391-4914
> http://saliya.org
> 

Regards,
Chiwan Park
