I use MapReduce jobs to generate the table data with Elephant-Bird's
OutputFormat. Hive can then read that data through an EXTERNAL table,
using the ProtobufHiveSerde and ProtobufBlockInputFormat classes
generated by Elephant-Bird. The CREATE TABLE statement looks like this:

CREATE EXTERNAL TABLE IF NOT EXISTS TABLE_NAME
(
...
)
ROW FORMAT SERDE 'elephantbird.proto.hive.serde.LzoXXXProtobufHiveSerde'
STORED AS
  INPUTFORMAT 'elephantbird.proto.mapred.input.DeprecatedLzoXXXProtobufBlockInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.mapred.SequenceFileOutputFormat'
LOCATION '/PATH';

So the solution is to define an external table over the protobuf files
and let Hive read them directly, with no intermediate plain-text copy.
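
A minimal sketch of how a Hive session might query such a table,
assuming the Elephant-Bird jar and the jar holding the generated
protobuf, SerDe, and InputFormat classes are not already on Hive's
classpath. Both jar paths are hypothetical placeholders, not taken from
a real setup; TABLE_NAME is the placeholder from the statement above.

```sql
-- Hypothetical jar locations; adjust to wherever the jars actually live.
ADD JAR /path/to/elephant-bird.jar;
ADD JAR /path/to/generated-protobuf-classes.jar;

-- Once the external table is defined, it is queried like any other table.
SELECT * FROM TABLE_NAME LIMIT 10;
```

Because the table is EXTERNAL, dropping it removes only the Hive
metadata and leaves the files under LOCATION in place.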

Let me know if it helps.

On Thu, Sep 1, 2011 at 8:45 PM, Matias Silva <msi...@specificmedia.com> wrote:
> Hi everyone, is there any documentation on importing Google Protocol
> Buffer (GPB) files into Hive? I've been scouring the internet, and the
> closest thing I came across is
> http://search-hadoop.com/m/9zF4MEW5Od1/v=plain
> I saw something from Elephant-Bird where I can load the GPB file using
> Pig, store it in a plain text format, and then load that into Hive.
> It would be great if I could just load from GPB directly into Hive.
> Any pointers?
> Thanks for your time and knowledge,
> Matt
>
>
