[ https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427659#comment-15427659 ]
Lefty Leverenz commented on HIVE-12077:
---------------------------------------

Doc note: HIVE-14571 tracks documenting the new configuration parameter *hive.msck.repair.batch.size*.

> MSCK Repair table should fix partitions in batches
> ---------------------------------------------------
>
>                 Key: HIVE-12077
>                 URL: https://issues.apache.org/jira/browse/HIVE-12077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Ryan P
>            Assignee: Chinna Rao Lalam
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch
>
>
> If a user attempts to run MSCK REPAIR TABLE on a directory with a large
> number of untracked partitions HMS will OOME. I suspect this is because it
> attempts to do one large bulk load in an effort to save time. Ultimately this
> can lead to a collection so large in size that HMS eventually hits an Out of
> Memory Exception.
> Instead I suggest that Hive include a configurable batch size that HMS can
> use to break up the load.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
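
For illustration only, the batching idea from the issue description might be sketched as follows. This is not the code from the attached patches. The names `batches`, `repair_in_batches`, and the `add_partitions` callback are hypothetical stand-ins for whatever call registers partitions with the metastore, and treating a non-positive batch size as "one bulk load" is an assumption of this sketch.

```python
from typing import Callable, Iterator, List, Sequence


def batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items.

    Assumption of this sketch: a non-positive batch_size means
    "no batching", i.e. everything goes out as one bulk load.
    """
    if batch_size <= 0:
        if items:
            yield list(items)
        return
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def repair_in_batches(
    untracked: Sequence[str],
    batch_size: int,
    add_partitions: Callable[[List[str]], None],
) -> int:
    """Hand untracked partitions to add_partitions one batch at a time.

    Only one batch is materialised per call, so the receiver never has
    to hold the full partition list in a single request.
    Returns the number of calls made.
    """
    calls = 0
    for batch in batches(untracked, batch_size):
        add_partitions(batch)  # hypothetical metastore call
        calls += 1
    return calls


if __name__ == "__main__":
    # Seven made-up partition names, registered three at a time.
    partitions = [f"dt=2015-10-{day:02d}" for day in range(1, 8)]
    sizes: List[int] = []
    repair_in_batches(partitions, 3, lambda batch: sizes.append(len(batch)))
    print(sizes)  # [3, 3, 1]
```

The point of the design is that the size of each request is bounded by the batch size rather than by the number of untracked partitions on disk.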