Flume might be able to invoke Hive to do this as the data is ingested, but I don't know anything about Flume.
Brent has a nice blog post describing many of the details of partitioning:

http://www.brentozar.com/archive/2013/03/introduction-to-hive-partitioning/

We also cover them in our book.

The key steps for taking the file(s) you created and transforming them into partitioned data are the following:

1. Create an "external" table whose location is the directory where you wrote that big HDFS file (or files).
2. Create the final target table with the partitioning, as described in Brent's blog post.
3. Run a query against the first table to populate the second. Again, Brent covers the details.

See the Hive wiki for additional details on external tables, etc.

Dean

On Thu, Sep 12, 2013 at 7:55 PM, ch huang <justlo...@gmail.com> wrote:

> hi,all:
> i use flume collect log data and put it in hdfs ,i want to use
> hive to do some caculate, query based on timerange,i want to use parttion
> table ,
> but the data file in hdfs is a big file ,how can i put it into pratition
> table in hive?
>

--
Dean Wampler, Ph.D.
@deanwampler
http://polyglotprogramming.com
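
P.S. A rough HiveQL sketch of the three steps above, in case it helps. Everything specific here is an assumption for illustration, not something from your setup: the table names raw_logs and logs, the directory /flume/logs, and the idea that every log line begins with a yyyy-MM-dd date that can serve as the partition key. Adjust the columns and the date extraction to match your actual log format.

```sql
-- All names and paths below are hypothetical placeholders.

-- 1. External table over the directory Flume wrote to.
--    Each raw log line is read as a single string column.
CREATE EXTERNAL TABLE raw_logs (line STRING)
LOCATION '/flume/logs';

-- 2. Final target table, partitioned by day.
CREATE TABLE logs (line STRING)
PARTITIONED BY (dt STRING);

-- 3. Populate the partitioned table from the external one.
--    Dynamic partitioning lets Hive create one partition per
--    distinct dt value produced by the SELECT.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE logs PARTITION (dt)
SELECT line,
       substr(line, 1, 10) AS dt  -- assumes lines start with yyyy-MM-dd
FROM raw_logs;

-- A time-range query then only reads the matching partitions.
SELECT count(*)
FROM logs
WHERE dt BETWEEN '2013-09-01' AND '2013-09-12';
```

The partition column has to come last in the SELECT list for the dynamic-partition insert to work. If a single day holds too much data, you could partition by day and hour instead, at the cost of more (and smaller) partitions.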