Adding to Sanjay's reply:
The only thing left after Flume has created the partition directories is to
tell the Hive metastore about them, which you can do via the
ALTER TABLE ... ADD PARTITION command.
Then you can read the data via Hive straight away.
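A minimal sketch of that metastore update, assuming a table named logs partitioned by a dt string column and Flume writing under /flume/logs (all names here are illustrative, not from this thread):

```sql
-- Register a directory Flume has already written as a Hive partition.
-- IF NOT EXISTS makes the command safe to re-run from a scheduled job.
ALTER TABLE logs
  ADD IF NOT EXISTS PARTITION (dt='2013-09-14')
  LOCATION '/flume/logs/dt=2013-09-14';
```

After this, queries like SELECT ... FROM logs WHERE dt='2013-09-14' will see the newly ingested data.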
On Sat, Sep 14, 2013 at 10:00 AM, Sanjay Subramanian <sanjay.subr
A couple of days back, Erik Sammer, at the Hadoop Hands-On Lab at the Cloudera
Sessions, demonstrated how to achieve dynamic partitioning using Flume. It
created the partitioned directories on HDFS, which are then readily usable by
Hive.
Understanding what I can from the two lines of your mail:
Flume might be able to invoke Hive to do this as the data is ingested, but
I don't know anything about Flume.
Brent has a nice blog post describing many of the details of partitioning.
http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/
We also cover them in our book.
Also, have you done any analysis on this yet using the Hive documentation
that's publicly available?
If you show some initiative yourself, you're more likely to get others to
join your cause. :)
So, what have you tried before asking us for help?
On Thu, Sep 12, 2013 at 6:55 PM, ch huang wrote:
You will need to define a partition column, like date or hour, something like
this.
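For instance, a partitioned table definition might look like the following (the table name, columns, and location are illustrative assumptions, not from this thread):

```sql
-- External table whose partitions will be backed by Flume's output directories.
CREATE EXTERNAL TABLE IF NOT EXISTS logs (
  host    STRING,
  message STRING
)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/flume/logs';
```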
Then configure Flume to roll over files/directories based on your partition
column.
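As a rough illustration of the Flume side, the HDFS sink can write into date-partitioned directories using escape sequences in hdfs.path. The agent, sink, channel, and path names below are assumptions for the sketch:

```
# Hypothetical Flume agent config: HDFS sink writing into dt=YYYY-MM-DD dirs
agent.sinks.hdfsSink.type = hdfs
agent.sinks.hdfsSink.channel = memChannel
agent.sinks.hdfsSink.hdfs.path = /flume/logs/dt=%Y-%m-%d
agent.sinks.hdfsSink.hdfs.useLocalTimeStamp = true
agent.sinks.hdfsSink.hdfs.fileType = DataStream
agent.sinks.hdfsSink.hdfs.rollInterval = 3600
```

Each day Flume then starts a new dt=... directory, matching the layout Hive expects for its partitions.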
You will need some kind of cron job which checks for new data becoming
available in a directory or file and then adds it as a partition.
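A hypothetical helper for that cron step might look like this: given a date, it emits the HiveQL that registers the matching Flume output directory as a partition. The table name logs, the column dt, and the HDFS path are assumptions for the sketch, not from this thread:

```shell
#!/bin/sh
# Emit the HiveQL that registers one day's Flume output dir as a partition.
# Table name, partition column, and path are illustrative assumptions.
generate_add_partition() {
  dt="$1"
  echo "ALTER TABLE logs ADD IF NOT EXISTS PARTITION (dt='${dt}') LOCATION '/flume/logs/dt=${dt}';"
}

# A nightly cron entry could then run something like:
#   hive -e "$(generate_add_partition "$(date +%Y-%m-%d)")"
generate_add_partition "$(date +%Y-%m-%d)"
```

Since the command uses IF NOT EXISTS, re-running it for a day that is already registered is harmless.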