Re: Google Protocol Buffers and Hive

2011-09-02 Thread valentina kroshilina
You can still partition the data. You'll have to run queries to add partitions to the table, otherwise the table won't see a new partition, but you have to do that regardless of what type of table you use. We have a big cluster, so I don't really see any change in performance, Hive for this type
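
For illustration, registering a new partition on an external table might look roughly like the sketch below. The table name `events_pb`, the partition column `dt`, and the HDFS path are all made up for the example and are not from the original setup; only the `ALTER TABLE ... ADD PARTITION` and `SHOW PARTITIONS` statements themselves are standard HiveQL.

```sql
-- Hypothetical names: events_pb (table), dt (partition column), and the
-- LOCATION path are placeholders, not taken from the original thread.

-- Register a directory that the MapReduce job has already written,
-- so that Hive can see the new partition.
ALTER TABLE events_pb ADD IF NOT EXISTS
  PARTITION (dt = '2011-09-01')
  LOCATION '/data/events_pb/dt=2011-09-01';

-- Check that the partition is now visible to Hive.
SHOW PARTITIONS events_pb;
```

A statement like this would need to run once per new partition, for example at the end of the job that writes the data.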

Re: Google Protocol Buffers and Hive

2011-09-02 Thread valentina kroshilina
I use MapReduce to generate tables using Elephant-Bird's OutputFormat. Hive can read from EXTERNAL tables using the ProtobufHiveSerde and ProtobufBlockInputFormat generated by Elephant-Bird. The CREATE TABLE statement looks like the following: CREATE EXTERNAL TABLE IF NOT EXISTS TABLE_NAME ( ... ) ROW FORMAT SER