Hello,

Quick question about LZO compression.

After reading the docs, it seems to me that I must
use DeprecatedLzoTextInputFormat in order to work with LZO files.

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO

However, I am not sure that is correct.  As I understand it, as long as
there are no LZO index files, and the LZO codec is installed, I should be
able to use the standard Hive table definitions.  The normal mapper
facilities will see the file's '.lzo' file extension and map it to the
compression codec for unpacking.  The specialized LZO TextInputFormat is
only required if there is a mix of '.lzo' and '.lzo.index' files in the
table.  The normal facilities would try to process both kinds of file
because they do not know that the index file is simply metadata and not
part of the normal data set.
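
To make my mental model concrete, here is a rough Python sketch of what I
think happens.  It is a toy model, not Hadoop or Hive code: the suffix
table, the function names (codec_for, plan_reads) and the skip_index flag
are all made up for illustration, and the file names are hypothetical.

```python
# Toy model of suffix-based codec lookup; NOT actual Hadoop/Hive code.
# The suffix table below is an assumption made for illustration only.
CODEC_BY_SUFFIX = {
    ".lzo": "lzop",
    ".gz": "gzip",
    ".bz2": "bzip2",
}


def codec_for(filename):
    """Return the codec name whose suffix matches filename, or None."""
    for suffix, codec in CODEC_BY_SUFFIX.items():
        if filename.endswith(suffix):
            return codec
    return None  # no match: the file would be read as plain text


def plan_reads(filenames, skip_index=False):
    """Map each input file to the codec that would be used to read it.

    skip_index=True mimics an LZO-aware input format that treats
    '.lzo.index' files as metadata and leaves them out of the input.
    """
    plan = {}
    for name in filenames:
        if skip_index and name.endswith(".lzo.index"):
            continue
        plan[name] = codec_for(name)
    return plan


# Hypothetical table directory holding one data file and its index.
files = ["part-00000.lzo", "part-00000.lzo.index"]
print(plan_reads(files))
print(plan_reads(files, skip_index=True))
```

In this model the first call keeps the index file in the input with no
codec, so it would be read as if it were plain-text rows, while the second
call drops it and leaves only the '.lzo' data file.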

Is this correct?
