Hi Phil,

I think you can read ORC files using OrcInputFormat [1] with the readHadoopFile
method.

There is an example for MapReduce [2] on Stack Overflow. The approach also works
with Flink. You may have to use a RichMapFunction [3] to initialize the OrcSerde
and StructObjectInspector objects.
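A rough sketch of what this could look like (untested; assumes Flink 0.10 with the Hadoop compatibility classes and the Hive ORC jars on the classpath, and the input path is a placeholder):

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.hive.ql.io.orc.OrcInputFormat;
import org.apache.hadoop.hive.ql.io.orc.OrcStruct;
import org.apache.hadoop.io.NullWritable;

public class OrcReadJob {

  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // Hive's OrcInputFormat is a mapred FileInputFormat<NullWritable, OrcStruct>,
    // so it can be wrapped directly with readHadoopFile.
    DataSet<Tuple2<NullWritable, OrcStruct>> orc =
        env.readHadoopFile(new OrcInputFormat(), NullWritable.class,
            OrcStruct.class, "hdfs:///path/to/orc/table"); // placeholder path

    // As a first check, toString() prints the row. For typed field access you
    // would instead initialize an OrcSerde / StructObjectInspector in the
    // open() method of a RichMapFunction, as in the Stack Overflow example [2].
    orc.map(new MapFunction<Tuple2<NullWritable, OrcStruct>, String>() {
      @Override
      public String map(Tuple2<NullWritable, OrcStruct> record) {
        return record.f1.toString();
      }
    }).print();
  }
}
```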

Regards,
Chiwan Park

[1]: 
https://hive.apache.org/javadocs/r0.13.1/api/ql/org/apache/hadoop/hive/ql/io/orc/OrcInputFormat.html
[2]: 
http://stackoverflow.com/questions/22673222/how-do-you-use-orcfile-input-output-format-in-mapreduce
[3]: 
https://ci.apache.org/projects/flink/flink-docs-release-0.10/api/java/org/apache/flink/api/common/functions/RichMapFunction.html

> On Jan 28, 2016, at 4:44 AM, Philip Lee <philjj...@gmail.com> wrote:
> 
> Hello, 
> 
> Question about reading ORC format on Flink.
> 
> I want to use a dataset after converting CSV to ORC format with Hive.
> Can Flink support reading the ORC format?
> 
> If so, please let me know how to use the dataset in Flink.
> 
> Best,
> Phil