[ https://issues.apache.org/jira/browse/HIVE-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13024883#comment-13024883 ]
He Yongqiang commented on HIVE-2126:
------------------------------------

The reason for naming it "ReworkMapredInputFormat" is that the "reworkMapred" interface can also be used by other formats in the future, for example if some other file format also wants to change the mapred work depending on its input. What do you think?

> Hive's symlink text input format should be able to work with CombineHiveInputFormat
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-2126
>                 URL: https://issues.apache.org/jira/browse/HIVE-2126
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2126.1.patch
>
>
> At compile time, if a partition's file format is SymlinkTextInputFormat, Hive will
> replace the symlink path with the paths listed in the symlink file. This way, it will
> work with Hive's HiveCombineFileInputFormat.
> The reason we are doing it at compile time is:
> 1) At run time, the input path is used not only to get the record reader, but
> also by Hive to look up the aliases and thus the operator tree. But
> CombineHiveInputFormat can have multiple paths for each split, and when
> switching paths it also sets the new input file name on the job. So it always
> requires a real input path name; it cannot be faked.
> 2) Writing a new input format would require duplicating a lot of the work
> already done in the existing CombineHiveInputFormat.
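
To make the compile-time rewrite described above more concrete, here is a minimal sketch of the idea. It is not the code from HIVE-2126.1.patch. The names PlanSketch, ReworkPlan and SymlinkRework are hypothetical, PlanSketch is only a stand-in for the path-to-alias mapping mentioned in point 1, and the contents of the symlink file are assumed to have been read into a map already.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SymlinkReworkSketch {

    /** Hypothetical stand-in for the plan: input path -> table aliases read from it. */
    static final class PlanSketch {
        final Map<String, List<String>> pathToAliases = new LinkedHashMap<>();
    }

    /** Hypothetical hook that lets an input format rewrite the plan at compile time. */
    interface ReworkPlan {
        void rework(PlanSketch plan);
    }

    /** Replaces each symlink path with the real paths listed in its symlink file. */
    static final class SymlinkRework implements ReworkPlan {
        // Assumption: symlink path -> target paths, already parsed from the symlink file.
        private final Map<String, List<String>> symlinkTargets;

        SymlinkRework(Map<String, List<String>> symlinkTargets) {
            this.symlinkTargets = symlinkTargets;
        }

        @Override
        public void rework(PlanSketch plan) {
            Map<String, List<String>> rewritten = new LinkedHashMap<>();
            for (Map.Entry<String, List<String>> entry : plan.pathToAliases.entrySet()) {
                List<String> targets = symlinkTargets.get(entry.getKey());
                if (targets == null) {
                    // Not a symlink path: keep it under its original name.
                    targets = Arrays.asList(entry.getKey());
                }
                for (String target : targets) {
                    // Every real path inherits the aliases of the path it replaces.
                    rewritten.computeIfAbsent(target, k -> new ArrayList<>())
                             .addAll(entry.getValue());
                }
            }
            plan.pathToAliases.clear();
            plan.pathToAliases.putAll(rewritten);
        }
    }

    public static void main(String[] args) {
        PlanSketch plan = new PlanSketch();
        plan.pathToAliases.put("/warehouse/t/links", Arrays.asList("t"));
        plan.pathToAliases.put("/warehouse/u", Arrays.asList("u"));

        Map<String, List<String>> targets = new LinkedHashMap<>();
        targets.put("/warehouse/t/links", Arrays.asList("/data/a.txt", "/data/b.txt"));

        new SymlinkRework(targets).rework(plan);
        System.out.println(plan.pathToAliases);
    }
}
```

After the rework, every key in the mapping is a real file path, so a combining input format that switches paths inside one split can still look up the aliases for whichever path it is currently reading.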