You can still partition the data. You'll have to run queries to add
partitions to the table, otherwise the table won't see a new partition, but
you have to do that regardless of what type of table you use.
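As a rough sketch of what adding a partition looks like, assuming a table partitioned by a single string column: the table name my_table, the partition column dt, and the HDFS path below are all made up for illustration, so substitute your own.

```sql
-- Hypothetical names: my_table, dt, and the /data/... path are placeholders.
-- Assumes the table was created with PARTITIONED BY (dt STRING).

-- Register one new partition and point it at the directory holding its files.
ALTER TABLE my_table ADD IF NOT EXISTS
  PARTITION (dt = '2013-01-01')
  LOCATION '/data/my_table/dt=2013-01-01';

-- Check that Hive now sees the partition.
SHOW PARTITIONS my_table;
```

If the directories already follow the dt=value naming layout under the table's location, MSCK REPAIR TABLE my_table; may be able to pick up the missing partitions in one go instead of adding them one by one.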
We have a big cluster so I don't really see any change in performance, Hive
for this type
I use MapReduce to generate the tables using Elephant-Bird's OutputFormat.
Hive can read from EXTERNAL tables using ProtobufHiveSerde and
ProtobufBlockInputFormat generated by Elephant-Bird. The CREATE TABLE
statement looks like the following:
CREATE EXTERNAL TABLE IF NOT EXISTS TABLE_NAME
(
...
)
ROW FORMAT SER