Re: question about partition table in hive

2013-09-13 Thread Jagat Singh
Adding to Sanjay's reply: the only thing left after Flume has added partitions is to tell the Hive metastore to update its partition information, which you can do via the ADD PARTITION command. Then you can read the data via Hive straight away. On Sat, Sep 14, 2013 at 10:00 AM, Sanjay Subramanian < sanjay.subr
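
As a rough illustration of the ADD PARTITION step mentioned above, here is a minimal Python sketch that only builds the HiveQL statement as a string. The table name `events`, the partition columns `dt` and `hour`, and the HDFS path are hypothetical placeholders, not values from this thread; how the statement is actually submitted to Hive is left open.

```python
def add_partition_sql(table, partition, location):
    """Build a HiveQL ALTER TABLE ... ADD PARTITION statement.

    table     -- Hive table name (hypothetical in the example below)
    partition -- dict mapping partition column to value, in column order
    location  -- HDFS directory that already holds the partition's data
    """
    spec = ", ".join(f"{col}='{val}'" for col, val in partition.items())
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION ({spec}) LOCATION '{location}';"
    )


# Hypothetical table, columns, and path, for illustration only.
stmt = add_partition_sql(
    "events",
    {"dt": "2013-09-13", "hour": "10"},
    "/flume/events/dt=2013-09-13/hour=10",
)
print(stmt)
```

IF NOT EXISTS keeps the statement safe to re-run if the partition was already registered.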

Re: question about partition table in hive

2013-09-13 Thread Sanjay Subramanian
A couple of days back, Erik Sammer at the Hadoop Hands On Lab at the Cloudera Sessions demonstrated how to achieve dynamic partitioning using Flume, creating those partitioned directories on HDFS, which are then readily usable by Hive. Understanding what I can from the two lines of your mail be

Re: question about partition table in hive

2013-09-13 Thread Dean Wampler
Flume might be able to invoke Hive to do this as the data is ingested, but I don't know anything about Flume. Brent has a nice blog post describing many of the details of partitioning. http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/ We also cover them in our book. The

Re: question about partition table in hive

2013-09-13 Thread Stephen Sprague
And have you done any analysis on this yet using the Hive documentation that's publicly available? If you show some initiative yourself, you're more likely to get others to join your cause. :) So what have you tried before asking us for help? On Thu, Sep 12, 2013 at 6:55 PM, ch huang wrote: >

Re: question about partition table in hive

2013-09-13 Thread Nitin Pawar
You will need to define a partition column such as date or hour. Then configure Flume to roll over files/directories based on your partition column. You will need some kind of cron job which will check for new data being available in a directory or file and then add it as partitio
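
The cron-style check described above could look roughly like the following Python sketch. It assumes, purely for illustration, that Flume writes directories named `column=value` under a base path, and that the job already has a list of directories and a record of partitions it registered earlier. The table name, base path, and helper names are all hypothetical; listing HDFS and running the statements in Hive are deliberately left out.

```python
def parse_partition(path, base):
    """Return {column: value} for a key=value directory under base, else None."""
    relative = path[len(base):].strip("/")
    spec = {}
    for part in relative.split("/"):
        key, sep, value = part.partition("=")
        if not sep:
            # Not a key=value directory (e.g. a temp dir), so skip it.
            return None
        spec[key] = value
    return spec


def new_partition_statements(table, base, directories, known):
    """Build ADD PARTITION statements for directories not yet registered.

    known is a set of tuples of (column, value) pairs already added.
    """
    statements = []
    for path in sorted(directories):
        if not path.startswith(base):
            continue
        spec = parse_partition(path, base)
        if spec is None or tuple(spec.items()) in known:
            continue
        cols = ", ".join(f"{col}='{val}'" for col, val in spec.items())
        statements.append(
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION ({cols}) LOCATION '{path}';"
        )
    return statements


# Hypothetical inputs: in a real job these would come from an HDFS listing
# and from whatever bookkeeping the job keeps between runs.
directories = [
    "/flume/events/dt=2013-09-13",
    "/flume/events/dt=2013-09-14",
    "/flume/events/_tmp",
]
known = {(("dt", "2013-09-13"),)}
result = new_partition_statements("events", "/flume/events", directories, known)
for statement in result:
    print(statement)
```

Keeping the statement generation separate from the HDFS listing and the Hive call makes the logic easy to check without a cluster.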