Thanks Savant. I believe this will hold good for .zip files as well.

Thank You,
Manish.

From: Savant, Keshav [mailto:keshav.c.sav...@fisglobal.com]
Sent: Thursday, September 27, 2012 10:19 AM
To: user@hive.apache.org; manishbh...@rocketmail.com
Subject: RE: zip file or tar file cosumption

Manish,

The table created for zipped text files should be defined as a sequence file, for example:

CREATE TABLE my_table_zip(col1 STRING, col2 STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE;

After this you can use the regular load command to load these files, for example:

LOAD DATA LOCAL INPATH 'path-to-csv-file.gz' INTO TABLE my_table_zip;

Hope this helps.

Keshav C Savant

From: Manish Bhoge [mailto:manishbh...@rocketmail.com]
Sent: Wednesday, September 26, 2012 9:43 PM
To: user@hive.apache.org
Subject: Re: zip file or tar file cosumption

Hi Richin,

Thanks! Yes, this is what I wanted to understand: how to load a zip file into a Hive table. I'll try this option now.

Thank You,
Manish.

Sent from my BlackBerry, pls excuse typo
________________________________
From: <richin.j...@nokia.com>
Date: Wed, 26 Sep 2012 14:51:39 +0000
To: <user@hive.apache.org>
ReplyTo: user@hive.apache.org
Subject: RE: zip file or tar file cosumption

You are right, Chuck. I thought his question was how to use zip files or any compressed files in Hive tables.

Yeah, it seems you can't do that; see:
http://mail-archives.apache.org/mod_mbox/hive-user/201203.mbox/%3CCAENxBwxkF--3PzCkpz1HX21=gb9yvasr2jl0u3yul2tfgu0...@mail.gmail.com%3E

But you can always compress your files in gzip format and they should be good to go.

Richin

From: ext Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Wednesday, September 26, 2012 10:44 AM
To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption

But TEXTFILE in Hive always has newline as the record delimiter.
How could this possibly work with a zip/tar file that can contain ASCII 10 characters at random locations, and certainly does not have ASCII 10 at the end of each data record?

Chuck Connell
Nuance R&D Data Team
Burlington, MA

From: richin.j...@nokia.com [mailto:richin.j...@nokia.com]
Sent: Wednesday, September 26, 2012 10:14 AM
To: user@hive.apache.org; manishbh...@rocketmail.com
Subject: RE: zip file or tar file cosumption

Hi Manish,

If you have your zip file at location /home/manish/zipfile, you can just point your external table to that location, like:

CREATE EXTERNAL TABLE manish_test (field1 string, field2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY <your_column_delimiter> STORED AS TEXTFILE LOCATION '/home/manish/zipfile';

OR

If you already have an external table pointing to a certain location, you can load this zip file into your table with:

LOAD DATA INPATH '/home/manish/zipfile' INTO TABLE manish_test;

Hope this helps.

Richin

From: ext Manish Bhoge [mailto:manishbh...@rocketmail.com]
Sent: Wednesday, September 26, 2012 9:13 AM
To: user@hive.apache.org
Subject: Re: zip file or tar file cosumption

Hi Savant,

Got it. But I still need to understand how to load a zip file. Can I use a zip file directly in an external table? Could you please help me with the load statement?

Sent from my BlackBerry, pls excuse typo
________________________________
From: "Savant, Keshav" <keshav.c.sav...@fisglobal.com>
Date: Wed, 26 Sep 2012 12:25:38 +0000
To: user@hive.apache.org
ReplyTo: user@hive.apache.org
Cc: manish.bh...@target.com; chuck.conn...@nuance.com
Subject: RE: zip file or tar file cosumption

Another solution would be to use a shell script to do the following:

1. unzip the txt files,
2. merge those 50 (or N) text files, one by one, into one text file,
3. then zip/tar that bigger text file,
4. then upload that big zip/tar file into Hive.

Keshav C Savant

From: Connell, Chuck [mailto:chuck.conn...@nuance.com]
Sent: Wednesday, September 26, 2012 4:04 PM
To: user@hive.apache.org
Subject: RE: zip file or tar file cosumption

This could be a problem. Hive uses newline as the record separator. A ZIP file will certainly contain newline characters. So I doubt this is possible.

BUT, I would like to hear from anyone who has solved the "newline is always a record separator" problem, because we ran into it for another type of compressed file.

Chuck
________________________________
From: Manish.Bhoge [manish.bh...@target.com]
Sent: Wednesday, September 26, 2012 3:17 AM
To: user@hive.apache.org
Subject: zip file or tar file cosumption

Hivers,

I want to understand whether it is possible to use zip/tar files directly in Hive. All the files have a similar schema (structure). Say 50 *.txt files are zipped into a single zip file: can we load data directly from this zip file, or do we need to unzip it first?

Thanks & Regards
Manish Bhoge | Technical Architect * Target DW/BI| * +919379850010 (M) Ext: 5691 VOIP: 22165 | * "Excellence is not a skill, It is an attitude."
MySite<http://mysites.target.com/personal/z063783>
_____________
The information contained in this message is proprietary and/or confidential. If you are not the intended recipient, please: (i) delete the message and all copies; (ii) do not disclose, distribute or use the message in any manner; and (iii) notify the sender immediately. In addition, please be aware that any message addressed to our domain is subject to archiving and review by persons other than the intended recipient. Thank you.
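
The unzip, merge, and recompress steps that Keshav lists in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python's standard library rather than the shell script he mentions, and it writes gzip output (the format Richin suggests) instead of zip/tar. The function name zip_to_gzip, its suffix parameter, and the file names in the usage note are hypothetical and do not come from the thread.

```python
import gzip
import zipfile


def zip_to_gzip(zip_path, gz_path, suffix=".txt"):
    """Concatenate every `suffix` member of a ZIP archive into one gzip file.

    Hypothetical helper for illustration only.
    Returns the number of archive members that were merged.
    """
    merged = 0
    with zipfile.ZipFile(zip_path) as archive, gzip.open(gz_path, "wb") as out:
        # Sort the member names so the merge order is deterministic.
        for name in sorted(archive.namelist()):
            if not name.endswith(suffix):
                continue  # skip directories and files of other types
            # Reads the whole member into memory; fine for a sketch,
            # but very large members would be better streamed in chunks.
            data = archive.read(name)
            out.write(data)
            # Make sure the last record of each file ends with a newline,
            # so it does not run into the first record of the next file.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
            merged += 1
    return merged
```

For example, zip_to_gzip("files.zip", "merged.txt.gz") would write a single gzip file of newline-terminated records, which could then be loaded with a LOAD DATA LOCAL INPATH statement like the ones quoted earlier in the thread.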