Hi Sreenath
Output compression is most useful at the storage level. When a larger file is compressed, it occupies fewer HDFS blocks, and thereby the cluster becomes more scalable in terms of the number of files.

Yes, the LZO libraries need to be present on all TaskTracker nodes as well as on the node that hosts the Hive client.

Regards
Bejoy KS

________________________________
From: Sreenath Menon <sreenathmen...@gmail.com>
To: user@hive.apache.org; Bejoy Ks <bejoy...@yahoo.com>
Sent: Wednesday, June 6, 2012 3:25 PM
Subject: Re: Compressed data storage in HDFS - Error

Hi Bejoy

I would like to make this clear. There is no gain in processing throughput/time from compressing the data stored in HDFS (not talking about intermediate compression), right?

And do I need to add the LZO libraries in Hadoop_Home/lib/native on all the nodes (including the slave nodes)?
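
To illustrate the point in the reply about compressed files saving HDFS blocks, here is a rough back-of-the-envelope sketch. The 64 MB block size, the 1 GB file size, and the 0.4 compression ratio are assumptions chosen only for illustration; they do not come from this thread, and actual ratios depend on the data and the codec.

```python
import math

# Assumed HDFS block size in MB; check dfs.block.size on your own cluster.
BLOCK_SIZE_MB = 64


def hdfs_blocks(file_size_mb, block_size_mb=BLOCK_SIZE_MB):
    """Return the number of HDFS blocks a file of the given size occupies."""
    return math.ceil(file_size_mb / block_size_mb)


# Hypothetical figures, for illustration only.
raw_size_mb = 1024        # a 1 GB uncompressed output file
compression_ratio = 0.4   # compressed size assumed to be 40% of the raw size
compressed_size_mb = raw_size_mb * compression_ratio

raw_blocks = hdfs_blocks(raw_size_mb)
compressed_blocks = hdfs_blocks(compressed_size_mb)

print("blocks without compression:", raw_blocks)
print("blocks with compression:", compressed_blocks)
```

Under these assumed numbers the same output takes fewer than half as many blocks, so the NameNode has less block metadata to track. This is a storage-side saving only and says nothing about processing time.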