Thanks. Does this mean I need to create a partition for each day manually? Is there no way to have Hive infer that from my directory structure?
On Mar 29, 2013, at 10:40 AM, Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> wrote:

> Hi
>
> CREATE EXTERNAL TABLE IF NOT EXISTS log_data(col1 datatype1, col2
> datatype2, . . . colN datatypeN) PARTITIONED BY (YEAR INT, MONTH INT, DAY
> INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';
>
> ALTER TABLE log_data ADD PARTITION (YEAR=2013, MONTH=2, DAY=27) LOCATION
> '/path/to/YEAR/MONTH/DAY/directory/ON/HDFS';
>
> Hive will read gzip and bz2 files out of the box (so if you had hourly
> log files in gzip format in your /YEAR/MONTH/DAY directory, they will be
> read).
> Snappy and LZO will need some jar installs and configs:
> https://github.com/toddlipcon/hadoop-lzo
> https://code.google.com/p/snappy/
>
> Note that gzip, for example, is not a splittable format: a single gzip
> file cannot be divided among several mappers, so very large gzip files
> are not recommended as map input.
>
> Hope this helps
>
> sanjay
>
> On 3/29/13 10:19 AM, "Mark" <static.void....@gmail.com> wrote:
>
>> We have existing log data in directories in the format of YEAR/MONTH/DAY.
>>
>> - How can we create a table over this data without Hive modifying and/or
>> moving it?
>> - How can we tell Hive to partition this data so it knows about each day
>> of logs?
>> - Does Hive work out of the box with reading compressed files?
>>
>> Thanks
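The ALTER TABLE ... ADD PARTITION statement quoted above has to be issued once per day directory, so one way to avoid typing each one by hand is to generate the statements with a small script. The following is only a sketch under stated assumptions: the function name `add_partition_statements`, the base path `/path/to`, and the unpadded directory names (for example `/path/to/2013/2/27`) are hypothetical and need to be adjusted to the real layout on HDFS.

```python
from datetime import date, timedelta


def add_partition_statements(table, base_path, start, end):
    """Yield one ALTER TABLE ... ADD PARTITION statement per day in [start, end]."""
    day = start
    while day <= end:
        # Assumption: directories use unpadded numbers, e.g. /path/to/2013/2/27.
        location = f"{base_path}/{day.year}/{day.month}/{day.day}"
        yield (
            f"ALTER TABLE {table} ADD IF NOT EXISTS "
            f"PARTITION (YEAR={day.year}, MONTH={day.month}, DAY={day.day}) "
            f"LOCATION '{location}';"
        )
        day += timedelta(days=1)


if __name__ == "__main__":
    # Hypothetical table name, base path, and date range, for illustration only.
    for stmt in add_partition_statements(
        "log_data", "/path/to", date(2013, 2, 27), date(2013, 3, 1)
    ):
        print(stmt)
```

The printed statements can be saved to a file and run with `hive -f`. IF NOT EXISTS keeps the script safe to re-run after some partitions have already been added.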