[ https://issues.apache.org/jira/browse/HIVE-1806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12974713#action_12974713 ]
Namit Jain commented on HIVE-1806:
----------------------------------

Mostly looks good - a minor comment.

In the new test that you added, the merge job is a map-only job even though you are using HiveInputFormat. This is because you are using Hadoop 20, which supports CombineHiveIF.
Do you think that is the correct behavior? Looks OK, just wanted to confirm.

> The merge criteria on dynamic partitions should be per partition
> ----------------------------------------------------------------
>
>                 Key: HIVE-1806
>                 URL: https://issues.apache.org/jira/browse/HIVE-1806
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1806.2.patch, HIVE-1806.3.patch, HIVE-1806.4.patch, HIVE-1806.patch
>
>
> Currently the criterion for whether a merge job should be fired on dynamically generated partitions is the average file size across all dynamic partitions. It is very common that some dynamic partitions contain mostly large files and some contain mostly small files. Even though the average size of all files is larger than hive.merge.smallfiles.avgsize, we should merge only those partitions containing small files.
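
To make the difference between the two criteria concrete, here is a minimal sketch in Python. It is not Hive's implementation: the function names, the partition names, the file sizes, and the 16 MB threshold are all made up for illustration, with the threshold standing in for hive.merge.smallfiles.avgsize.

```python
from statistics import mean

MB = 1024 * 1024


def global_average_triggers_merge(file_sizes_by_partition, avg_size_threshold):
    """Criterion described as the current behavior: one average over all files."""
    all_sizes = [s for sizes in file_sizes_by_partition.values() for s in sizes]
    return bool(all_sizes) and mean(all_sizes) < avg_size_threshold


def partitions_to_merge(file_sizes_by_partition, avg_size_threshold):
    """Proposed criterion: evaluate the average file size per partition."""
    return [
        partition
        for partition, sizes in file_sizes_by_partition.items()
        if sizes and mean(sizes) < avg_size_threshold
    ]


# Hypothetical dynamic partitions: one with large files, one with small files.
sizes = {
    "ds=2010-12-01": [256 * MB, 300 * MB],
    "ds=2010-12-02": [1 * MB, 2 * MB, 3 * MB],
}
threshold = 16 * MB  # illustrative stand-in for hive.merge.smallfiles.avgsize

merge_everything = global_average_triggers_merge(sizes, threshold)
merge_candidates = partitions_to_merge(sizes, threshold)
```

With these numbers the large files pull the overall average well above the threshold, so the global check fires no merge at all, while the per-partition check selects only the partition holding small files.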