Hi Vince,

Hive doesn't copy or move the data files behind an external table.  Define the 
base table location as '/logs', then issue ALTER TABLE commands to add a 
partition for each date.

Example:

CREATE EXTERNAL TABLE logs (
  Data STRING
)
PARTITIONED BY (cal_date STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n'
LOCATION '/logs';

ALTER TABLE logs ADD IF NOT EXISTS PARTITION (cal_date = '2011-09-01')
LOCATION '/logs/log-2011-09-01';
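
The other dates follow the same pattern, one statement per partition.  The 
lines below are only a sketch: they assume the absolute HDFS paths from your 
listing, and that Hive accepts each LOCATION as a partition location on your 
version, which is worth checking on a single date first.

-- Register two more of the dates from the listing
ALTER TABLE logs ADD IF NOT EXISTS PARTITION (cal_date = '2011-09-02')
LOCATION '/logs/log-2011-09-02';

ALTER TABLE logs ADD IF NOT EXISTS PARTITION (cal_date = '2011-12-01')
LOCATION '/logs/log-2011-12-01';

-- List the partitions Hive now knows about
SHOW PARTITIONS logs;

-- Filter on the partition column so only that partition's data is read
SELECT COUNT(*) FROM logs WHERE cal_date = '2011-09-01';

If the count for a date comes back as zero or the query fails, the LOCATION 
for that partition is the first thing to look at.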

Matt Tucker
Associate eBusiness Analyst
Walt Disney Parks and Resorts Online
Ph: 407-566-2545
Tie: 8-296-2545

From: Vince Hoang [mailto:vho...@cafepress.com]
Sent: Thursday, December 08, 2011 3:47 PM
To: user@hive.apache.org
Subject: Partitioning EXTERNAL TABLE without copying or moving files

Hi,

I am running Hive 0.7.0 with Hadoop 0.20.2.  I have one HDFS folder full of web 
server logs dating back several months.

Is it possible to partition an EXTERNAL TABLE without copying/moving files or 
altering the layout of the directory?

For example, in HDFS, I have:

> /logs/log-2011-09-01
> /logs/log-2011-09-02
>   ...
> /logs/log-2011-12-01

I'd like to know if it's possible to partition the EXTERNAL TABLE by date 
without having to create subdirectories:

> /logs/2011-09-01/log-2011-09-01
> /logs/2011-09-02/log-2011-09-02
>   ...
> /logs/2011-12-01/log-2011-12-01

Is it possible?

Thanks,
Vince


