Re: Reading from ORC Files in HDFS

2017-12-19 Thread Allan Wilson
I had a feeling that would be the answer, but being new to Beam I wanted to make sure I wasn’t missing something. :) Thanks, Ismaël.

On 12/18/17, 3:07 AM, "Ismaël Mejía" wrote:
>Hello,
>
>There is no support yet for reading ORC files directly in Beam. You can
>track the progress of this issue her

Re: Reading from ORC Files in HDFS

2017-12-18 Thread Ismaël Mejía
Hello, There is no support yet for reading ORC files directly in Beam. You can track the progress of this issue here: https://issues.apache.org/jira/browse/BEAM-1861 In the meantime, you are better off using HCatalogIO than JdbcIO (its input splitting should be better). On Mon, Dec 18, 2017 at 4:17 AM, Allan Wilson wrote: > Hi, >

Reading from ORC Files in HDFS

2017-12-17 Thread Allan Wilson
Hi, Is there any way to read ORC files from HDFS directly using Apache Beam? I’m looking at loading Kafka with data stored in the ORC files that back our Hive tables. After doing some research it doesn’t look possible, but I thought I’d ask to make sure. It may be possible to use JDBC or HCatalog to qu