[ https://issues.apache.org/jira/browse/HIVE-21193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
BELUGA BEHR reassigned HIVE-21193:
----------------------------------

    Assignee: BELUGA BEHR

> Support LZO Compression with CombineHiveInputFormat
> ---------------------------------------------------
>
>                 Key: HIVE-21193
>                 URL: https://issues.apache.org/jira/browse/HIVE-21193
>             Project: Hive
>          Issue Type: Improvement
>          Components: Compression
>    Affects Versions: 4.0.0, 3.2.0
>            Reporter: BELUGA BEHR
>            Assignee: BELUGA BEHR
>            Priority: Major
>
> Regarding LZO compression with Hive...
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LZO
> It does not work out of the box if there are {{.lzo.index}} files present.
> As I understand it, this is because the default Hive input format,
> {{CombineHiveInputFormat}}, does not handle this case correctly. It does not
> distinguish between data files and index files: it lumps them all together
> when making the combined splits, and Mappers fail when they try to process
> the {{.lzo.index}} files as data. The original {{HiveInputFormat}} correctly
> identifies the {{.lzo.index}} files because it considers each file
> individually.
>
> Allow {{CombineHiveInputFormat}} to short-circuit LZO files and not combine
> them.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
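
The description notes that the original {{HiveInputFormat}} handles {{.lzo.index}} files correctly, which suggests a per-session workaround while {{CombineHiveInputFormat}} lacks this support. A minimal sketch, assuming the {{hive.input.format}} session property (not named in the issue) selects the input format, and using a hypothetical table name {{lzo_table}}:

```sql
-- Assumption: hive.input.format controls which input format Hive uses,
-- and CombineHiveInputFormat is its default value.
-- Switching to HiveInputFormat makes Hive consider each file individually.
SET hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;

-- lzo_table is a hypothetical table backed by .lzo data files
-- with accompanying .lzo.index files.
SELECT COUNT(*) FROM lzo_table;
```

This trades away split combining for the whole session, so a query over many small files may launch more Mappers than it would with the default input format.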