[ 
https://issues.apache.org/jira/browse/HIVE-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015706#comment-13015706
 ] 

He Yongqiang commented on HIVE-2089:
------------------------------------

Actually, I just found that recent Hadoop's CombineFileInputFormat supports 
non-splittable files as input. So .gz files won't be a problem if the Hadoop 
version in use has that feature checked in.

Another use case is Hive's SymlinkInputFormat, which may point to a very 
large number of .gz files.
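
To illustrate what combining non-splittable files means, here is a minimal 
sketch. It is not Hadoop's CombineFileInputFormat code and not the attached 
patch. The class name GzCombineSketch, the combine method and the 
maxSplitBytes parameter are hypothetical. It assumes a simple greedy policy: 
whole files are packed into one combined split until a size cap is reached, 
and a file is never cut, because a .gz stream cannot be read from the middle.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Hadoop or Hive.
public class GzCombineSketch {

    /**
     * Greedily groups whole files into combined splits.
     * A file is never divided, so a file larger than maxSplitBytes
     * ends up alone in its own split.
     */
    public static List<List<String>> combine(Map<String, Long> fileSizes,
                                             long maxSplitBytes) {
        List<List<String>> splits = new ArrayList<>();
        List<String> current = new ArrayList<>();
        long currentBytes = 0;
        for (Map.Entry<String, Long> entry : fileSizes.entrySet()) {
            long size = entry.getValue();
            // Close the current split if adding this file would exceed the cap.
            if (!current.isEmpty() && currentBytes + size > maxSplitBytes) {
                splits.add(current);
                current = new ArrayList<>();
                currentBytes = 0;
            }
            current.add(entry.getKey());
            currentBytes += size;
        }
        if (!current.isEmpty()) {
            splits.add(current);
        }
        return splits;
    }

    public static void main(String[] args) {
        // Made-up file names and sizes, in insertion order.
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("a.gz", 40L);
        files.put("b.gz", 50L);
        files.put("c.gz", 30L);
        files.put("d.gz", 120L);
        files.put("e.gz", 10L);
        System.out.println(combine(files, 100L));
    }
}
```

With one mapper per combined split instead of one per file, a partition with 
tens of thousands of small .gz files would need far fewer map tasks.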

> Add a new input format to be able to combine multiple .gz text files
> --------------------------------------------------------------------
>
>                 Key: HIVE-2089
>                 URL: https://issues.apache.org/jira/browse/HIVE-2089
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2089.1.patch
>
>
> For files that are not splittable, CombineHiveInputFormat won't help. This 
> jira is to add a new input format to support this feature. This is very useful 
> for partitions with tens of thousands of .gz files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
