[ https://issues.apache.org/jira/browse/HIVE-4690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739194#comment-13739194 ]

Navis commented on HIVE-4690:
-----------------------------

[~ashutoshc] I've investigated this a little.

CFIF in 20/20S accepts "mapred.max.split.size", but the shims for 20/20S ignore it, and only "mapred.min.split.size" is applied for all of them (see HadoopShimsSecure.getSplits()). Even when it is set manually, CFIF in 20/20S does not split a file smaller than the block size, so it produces one split.

The shims for 23 use the same code as 20/20S, but CFIF in 23 reads its configuration directly from JobConf, so the setting takes effect. It can also split a file smaller than the block size, producing 22 splits.

There should be a follow-up issue for setting "mapred.max.split.size", etc. properly for CFIF.

> stats_partscan_1.q makes different result with different hadhoop.mr.rev
> ------------------------------------------------------------------------
>
>                 Key: HIVE-4690
>                 URL: https://issues.apache.org/jira/browse/HIVE-4690
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: 0.11.0
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Trivial
>         Attachments: HIVE-4690.D11163.1.patch
>
>
> stats_partscan_1.q uses mapred.min/max.split.size and logs number of files,
> which can be different with different hadoop.mr.rev.
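
To make the difference described in the comment concrete, here is a minimal sketch of the two behaviours. It is not Hadoop's or Hive's code: the class SplitCountSketch, both method names, and the file, block, and max-split sizes are hypothetical, and real CFIF also groups blocks by node and rack, which this ignores. It only shows why a file smaller than one block yields a single split when the max split size is not honoured, and many splits when it is.

```java
public class SplitCountSketch {

    /** Block-only model: the max split size is ignored, so a file is cut at block boundaries only. */
    public static long splitsIgnoringMax(long fileLen, long blockSize) {
        if (fileLen <= 0) {
            return 0;
        }
        return (fileLen + blockSize - 1) / blockSize; // ceiling division
    }

    /** Max-aware model: each block is further cut into chunks of at most maxSplit bytes. */
    public static long splitsHonoringMax(long fileLen, long blockSize, long maxSplit) {
        if (maxSplit <= 0) {
            // No usable max split size: fall back to the block-only model.
            return splitsIgnoringMax(fileLen, blockSize);
        }
        long count = 0;
        for (long off = 0; off < fileLen; off += blockSize) {
            long blockLen = Math.min(blockSize, fileLen - off);
            count += (blockLen + maxSplit - 1) / maxSplit;
        }
        return count;
    }

    public static void main(String[] args) {
        long blockSize = 64L * 1024 * 1024; // hypothetical block size
        long fileLen = 5000;                // hypothetical file, far smaller than one block
        long maxSplit = 256;                // hypothetical max split size

        System.out.println("max ignored:  " + splitsIgnoringMax(fileLen, blockSize));
        System.out.println("max honoured: " + splitsHonoringMax(fileLen, blockSize, maxSplit));
    }
}
```

With these made-up sizes the first model reports one split and the second reports twenty, which is the same kind of gap that makes the logged file count in stats_partscan_1.q depend on hadoop.mr.rev.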