Documentation is here: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage

The performance overhead of compression is negligible for large amounts of data, but it becomes proportionally more noticeable as the data size gets smaller. The gain typically comes from smaller data transfers between nodes and fewer disk reads/writes. Again, the larger the data size, the greater the gain.
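To make the size effect concrete, here is a minimal local sketch in Python. It does not touch Hive or MapReduce; it only compares gzip's compressed-to-original size ratio for one tiny row against the same row repeated many times. The sample row and the repeat count are made up for illustration, and gzip is just one of the codecs you might configure.

```python
import gzip


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (below 1.0 means it shrank)."""
    return len(gzip.compress(payload)) / len(payload)


# Hypothetical sample data: a single short CSV-style row, and that row repeated.
row = b"1,sachin\n"
small = row
large = row * 100_000

small_ratio = compression_ratio(small)
large_ratio = compression_ratio(large)

print(f"small input ({len(small)} bytes): ratio = {small_ratio:.2f}")
print(f"large input ({len(large)} bytes): ratio = {large_ratio:.4f}")
```

The tiny input comes out larger than it went in, because the gzip header and trailer are a fixed cost, while the large input shrinks dramatically. Real tables are far less repetitive than this synthetic data, so treat the numbers as a direction, not a benchmark.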
Thanks.

From: Sachin Sudarshana [mailto:sachin.had...@gmail.com]
Sent: Sunday, June 9, 2013 11:04 PM
To: user@hive.apache.org
Subject: Compression in Hive

Hi,

I have been testing the usefulness of compression in Hive. I have a general question: are there any particular cases where compression in Hive actually proves useful when running MR jobs? Any pointers/examples would really be useful!

Thank you,
Sachin