OK, I am getting a little confused now.
Consider that I am working on a scenario where there is no limit on the
memory available.
In such a scenario, is there any advantage to storing data in HDFS in
compressed format? Any advantage, like, if node 1 has data available and it
is executing a particular t
> Debarshi Basak
> Tata Consultancy Services
> Mailto: debarshi.ba...@tcs.com
> Website: http://www.tcs.com
>
> Experience certainty. IT Services
> Business Solutions
> Outsourcing
> ____
>
> -----Bejoy Ks wrote: -
>
Any idea about LZO or bzip2? Are any of these splittable?
> To: "user@hive.apache.org"
> From: Bejoy Ks
> Date: 06/06/2012 03:37PM
> Subject: Re: Compressed data storage in HDFS - Error
hive client.
Regards
Bejoy KS
From: Sreenath Menon
To: user@hive.apache.org; Bejoy Ks
Sent: Wednesday, June 6, 2012 3:25 PM
Subject: Re: Compressed data storage in HDFS - Error
Hi Bejoy
I would like to make this clear.
There is no gain in processing throughput/time from compressing the data
stored in HDFS (not talking about intermediate compression), right?
And do I need to add the LZO libraries to Hadoop_Home/lib/native on all
the nodes (including the slave nodes)?
From: Sreenath Menon
To: user@hive.apache.org
Sent: Wednesday, June 6, 2012 3:08 PM
Subject: Re: Compressed data storage in HDFS - Error
Thanks for the response.
1) How do I use the Gz compression, and does it come with Hadoop? Or else, how do
I build a compression method for use in Hive? I would like
=true
mapred.map.output.compression.codec= hadoop.compression.lzo.LzoCodec
Regards
Bejoy KS
From: Siddharth Tiwari
To: "user@hive.apache.org "
Sent: Wednesday, June 6, 2012 2:58 PM
Subject: RE: Compressed data storage in HDFS - Error
There is something you gain and some
OK, understood. So you load the compressed data into memory (thereby
decreasing the size of the file that needs to be loaded) and then apply the
decompression algorithm to get the uncompressed data. Is this what happens?
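
To make the question above concrete, here is a minimal sketch using only the Python standard library. It is not Hadoop's record reader; the file name, row layout, and helper names are made up for illustration. It shows one common pattern: the reader pulls compressed bytes from disk and decompresses them as a stream while iterating, so the bytes read from storage are fewer than the bytes handed to the processing code.

```python
import gzip
import os
import tempfile


def write_gzip(path, lines):
    """Write an iterable of byte lines to a gzip-compressed file."""
    with gzip.open(path, "wb") as out:
        for line in lines:
            out.write(line)


def stream_count(path):
    """Decompress while reading; return (record_count, decompressed_bytes)."""
    records = 0
    decoded = 0
    with gzip.open(path, "rb") as src:
        for line in src:  # each line is decompressed on the fly
            records += 1
            decoded += len(line)
    return records, decoded


# Hypothetical sample rows, chosen to be repetitive so they compress well.
ROWS = [b"%d,some,repetitive,row\n" % i for i in range(10000)]

with tempfile.TemporaryDirectory() as tmp:
    PATH = os.path.join(tmp, "rows.csv.gz")
    write_gzip(PATH, ROWS)
    ON_DISK = os.path.getsize(PATH)          # compressed bytes read from storage
    RECORDS, DECODED = stream_count(PATH)    # bytes seen after decompression

print(f"on disk: {ON_DISK} bytes, decoded: {DECODED} bytes, records: {RECORDS}")
```

The whole file never has to be decompressed up front; the cost moves from disk reads to CPU time spent inflating each chunk.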
Thanks for the response.
1) How do I use the Gz compression, and does it come with Hadoop? Or else, how
do I build a compression method for use in Hive? I would like to run an
evaluation across compression methods.
What is the default compression used in Hadoop?
2) Kindly bear with me if this question
There is something you gain and something you lose.
Compression would reduce IO at the cost of increased CPU work. Also, you would see
different results for different tasks, i.e. HDFS read, HDFS write, shuffle
and sort. So whether to go for compression or not depends on your usage.
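
The IO-versus-CPU trade-off described above can be measured on a small scale. The sketch below is only an illustration with Python's standard `gzip` and `bz2` modules, not a Hadoop or Hive benchmark; the sample data and the `measure` helper are hypothetical, and LZO is left out because it is not in the standard library. Real numbers depend heavily on the data and the cluster.

```python
import bz2
import gzip
import time


def measure(name, compress, decompress, payload):
    """Return size and timing figures for one codec on one payload."""
    start = time.perf_counter()
    packed = compress(payload)
    compress_s = time.perf_counter() - start

    start = time.perf_counter()
    restored = decompress(packed)
    decompress_s = time.perf_counter() - start

    assert restored == payload  # the round trip must be lossless
    return {
        "codec": name,
        "raw_bytes": len(payload),
        "packed_bytes": len(packed),
        "ratio": len(packed) / len(payload),
        "compress_s": compress_s,
        "decompress_s": decompress_s,
    }


# Hypothetical log-like rows; repetitive text usually compresses well.
SAMPLE = b"".join(
    b"%d,user_%d,click,2012-06-06\n" % (i, i % 50) for i in range(20000)
)

RESULTS = [
    measure("gzip", gzip.compress, gzip.decompress, SAMPLE),
    measure("bzip2", bz2.compress, bz2.decompress, SAMPLE),
]

for row in RESULTS:
    print(
        f"{row['codec']:>5}: {row['raw_bytes']} -> {row['packed_bytes']} bytes "
        f"(ratio {row['ratio']:.3f}), "
        f"compress {row['compress_s']:.4f}s, decompress {row['decompress_s']:.4f}s"
    )
```

Fewer bytes mean less IO, while the timing columns show the CPU cost paid for it; which side wins depends on whether the job is IO-bound or CPU-bound.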
Basically, when your data is compressed you have less IO than with uncompressed data. During job execution it doesn't decompress. It would be a more relevant question for Hadoop's mailing list than Hive's.
Debarshi Basak

Yes, performance is better because your IO is less when your data is less.
Debarshi Basak

LZO doesn't ship with Apache Hadoop; you need to build it. Try GZ.
Debarshi Basak