I'm not seeing any documentation link in Sanjay's message, so here it is
again (in the Hive wiki's language manual):
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO


On Thu, Aug 8, 2013 at 3:30 PM, Sanjay Subramanian <
sanjay.subraman...@wizecommerce.com> wrote:

>  Please refer to the documentation here.
> Let me know if you need more clarification so that we can make this
> document better and more complete.
>
>  Thanks
>
>  sanjay
>
>   From: w00t w00t <w00...@yahoo.de>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>, w00t w00t <
> w00...@yahoo.de>
> Date: Thursday, August 8, 2013 2:02 AM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Hive and Lzo Compression
>
>
>    Hello,
>
> I have started running Hive with Lzo compression on Hortonworks 1.2.
>
> I have managed to install/configure Lzo and  hive -e "set
> io.compression.codecs" shows me the Lzo Codecs:
> io.compression.codecs=
> org.apache.hadoop.io.compress.GzipCodec,
> org.apache.hadoop.io.compress.DefaultCodec,
> com.hadoop.compression.lzo.LzoCodec,
> com.hadoop.compression.lzo.LzopCodec,
> org.apache.hadoop.io.compress.BZip2Codec
>
> However, I have some questions and would be grateful for your help.
>
> (1) CREATE TABLE statement
>
>  I read in several postings that the CREATE TABLE statement needs the
> following STORED AS clause:
>
>  CREATE EXTERNAL TABLE txt_table_lzo (
>     txt_line STRING
>  )
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
>  STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
>  LOCATION '/user/myuser/data/in/lzo_compressed';
>
>  SELECT statements on this table with Lzo data now run without any
> problems.
>
>  However I also created a table on the same data without this STORAGE
> clause:
>
>  CREATE EXTERNAL TABLE txt_table_lzo_tst (
>     txt_line STRING
>  )
>  ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
>  LOCATION '/user/myuser/data/in/lzo_compressed';
>
>  The interesting thing is that it works as well when I execute a SELECT
> statement on this table.
>
>  Can you explain why the second CREATE TABLE statement works as well?
>  Which form should I use in DDLs?
>  Is it best practice to use the STORED AS clause with
> "DeprecatedLzoTextInputFormat"? Or should I remove it?
>
>
> (2) Output and Intermediate Compression Settings
>
>  I want to use output compression.
>
>  In "Programming Hive" from Capriolo, Wampler, Rutherglen the following
> commands are recommended:
>  SET hive.exec.compress.output=true;
>  SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
>
>  However, in some forum posts I found the following recommended
> settings:
>  SET hive.exec.compress.output=true;
>  SET mapreduce.output.fileoutputformat.compress=true;
>  SET mapreduce.output.fileoutputformat.compress.codec=com.hadoop.compression.lzo.LzopCodec;
>
>  Am I right that the first settings are for Hadoop versions prior to 0.23?
>  Or is there any other reason why the settings are different?
>
>  I am using Hadoop 1.1.2 with Hive 0.10.0.
>  Which settings would you recommend to use?
>
>  --------------
>  I also want to compress intermediate results.
>
>  Again, in "Programming Hive" the following settings are recommended:
>  SET hive.exec.compress.intermediate=true;
>  SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
>
>  Is this the right setting?
>
>  Or should I again use the following settings, which look more
> appropriate for Hadoop 0.23 and later?
>  SET hive.exec.compress.intermediate=true;
>  SET mapreduce.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
>
> Thanks
>
>

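On question (1), a hedged note: Hive's default text input format picks a
decompression codec from the file extension, so a table declared without
the STORED AS clause can still read .lzo files as long as the Lzo codecs
are listed in io.compression.codecs. As far as I know, the practical
difference is that only the Lzo-specific input format can use .lzo.index
files to split a large file across several mappers; without it, each
compressed file is read by a single mapper. A minimal sketch, reusing the
DDL from the question unchanged apart from the table name, which is
hypothetical:

```sql
-- Sketch only: txt_table_lzo_indexed is a hypothetical name; the columns,
-- delimiter, and location are copied from the question as-is.
CREATE EXTERNAL TABLE txt_table_lzo_indexed (
    txt_line STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '||||'
-- The Lzo input format is what allows indexed .lzo files to be split.
STORED AS INPUTFORMAT 'com.hadoop.mapred.DeprecatedLzoTextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION '/user/myuser/data/in/lzo_compressed';
```

If the files are small or have not been indexed, both table definitions
should behave the same for reads.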

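On question (2), the mapred.* names are the ones Hadoop 1.x reads, while
the mapreduce.* names came in with the 0.23/2.x line, which still accepts
the old names as deprecated aliases. For the Hadoop 1.1.2 / Hive 0.10.0
setup described in the question, a minimal sketch of the session settings
might look like this, assuming the Lzo codecs are installed as shown in the
io.compression.codecs output:

```sql
-- Sketch only, using Hadoop 1.x property names.

-- Compress the final output of queries.
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;

-- Compress intermediate data passed between stages of a query.
SET hive.exec.compress.intermediate=true;
SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzopCodec;
```

One caveat on the newer names: as far as I know, the 2.x property for the
map output codec is mapreduce.map.output.compress.codec rather than the
"compression.codec" spelling quoted in the question, so that is worth
checking against the Hadoop documentation before relying on it.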

-- Lefty
