Flume might be able to invoke Hive to do this as the data is ingested, but
I don't know anything about Flume.

Brent has a nice blog post describing many of the details of partitioning.

http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/

We also cover them in our book. The key steps to taking the file(s) you
created and transforming them into partitioned data are the following:

1. Create an "external" table whose location is the directory where you
wrote that big HDFS file (or files).
2. Create the final target table with the partitioning, as described in
Brent's blog post.
3. Run a query against the first table to populate the second. Again, Brent
covers the details.
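
Roughly, the three steps might look like the HiveQL sketch below. I don't
know your log format, so the table names (raw_logs, logs_by_day), the
columns, the tab delimiter, and the /flume/logs path are all made up for
illustration; adjust them to match what Flume actually wrote.

```sql
-- Step 1: external table over the directory holding the big file(s).
-- Path, columns, and delimiter are assumptions, not your real layout.
-- ts is assumed to look like '2013-09-12 19:55:00'.
CREATE EXTERNAL TABLE raw_logs (
  ts    STRING,
  level STRING,
  msg   STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/flume/logs';

-- Step 2: the final target table, partitioned by day.
CREATE TABLE logs_by_day (
  ts    STRING,
  level STRING,
  msg   STRING
)
PARTITIONED BY (dt STRING);

-- Step 3: populate the partitioned table from the external one.
-- Dynamic partitioning lets Hive derive each row's partition from
-- the last column of the SELECT.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE logs_by_day PARTITION (dt)
SELECT ts, level, msg, to_date(ts) AS dt
FROM raw_logs;

-- A time-range query can then filter on the partition column,
-- so only the matching partitions are read.
SELECT count(*)
FROM logs_by_day
WHERE dt >= '2013-09-01' AND dt <= '2013-09-12';
```

Partitioning by day is just one choice; if your queries are usually by
hour or by month, partition on that instead.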

See the Hive wiki for additional details on external tables, etc.

Dean



On Thu, Sep 12, 2013 at 7:55 PM, ch huang <justlo...@gmail.com> wrote:

> Hi all,
>         I use Flume to collect log data and put it in HDFS. I want to use
> Hive to do some calculations and queries based on time range, so I want
> to use a partitioned table,
> but the data file in HDFS is one big file. How can I put it into a
> partitioned table in Hive?
>



-- 
Dean Wampler, Ph.D.
@deanwampler
http://polyglotprogramming.com
