[ 
https://issues.apache.org/jira/browse/HIVE-12077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15455819#comment-15455819
 ] 

Thejas M Nair commented on HIVE-12077:
--------------------------------------

There is a bug in this patch. 
Let's say batch_size = 5 and the number of partitions = 9: the patch will 
skip the last 4 partitions, so they are never added.
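
To make the arithmetic concrete, here is a minimal sketch of how a trailing 
partial batch can get dropped. It does not reproduce the loop from the patch; 
the class and method names (BatchSketch, fullBatchesOnly, withRemainder) are 
hypothetical, and the first method only illustrates one common way this kind 
of bug arises.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the code from the HIVE-12077 patch.
public class BatchSketch {

    // One way the reported symptom can arise: a batch is flushed only when
    // it is full, so a trailing partial batch is silently dropped.
    static List<List<String>> fullBatchesOnly(List<String> partitions, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String p : partitions) {
            current.add(p);
            if (current.size() == batchSize) {
                batches.add(current);
                current = new ArrayList<>();
            }
        }
        // Whatever is left in `current` is never flushed.
        return batches;
    }

    // Walks the list in steps of batchSize (assumed > 0) and clamps the end
    // index, so the last, shorter batch is included as well.
    static List<List<String>> withRemainder(List<String> partitions, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < partitions.size(); start += batchSize) {
            int end = Math.min(start + batchSize, partitions.size());
            batches.add(new ArrayList<>(partitions.subList(start, end)));
        }
        return batches;
    }

    // Total number of partitions across all batches.
    static int count(List<List<String>> batches) {
        int n = 0;
        for (List<String> b : batches) {
            n += b.size();
        }
        return n;
    }

    public static void main(String[] args) {
        List<String> partitions = new ArrayList<>();
        for (int i = 1; i <= 9; i++) {
            partitions.add("p=" + i);
        }
        System.out.println("full batches only: " + count(fullBatchesOnly(partitions, 5)));
        System.out.println("with remainder:    " + count(withRemainder(partitions, 5)));
    }
}
```

With 9 partitions and a batch size of 5, the first method covers 5 partitions 
and the second covers all 9, which matches the 4 missing partitions described 
above.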


> MSCK Repair table should fix partitions in batches 
> ---------------------------------------------------
>
>                 Key: HIVE-12077
>                 URL: https://issues.apache.org/jira/browse/HIVE-12077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Ryan P
>            Assignee: Chinna Rao Lalam
>             Fix For: 2.2.0
>
>         Attachments: HIVE-12077.1.patch, HIVE-12077.2.patch, 
> HIVE-12077.3.patch, HIVE-12077.4.patch, HIVE-12077.5.patch
>
>
> If a user attempts to run MSCK REPAIR TABLE on a directory with a large 
> number of untracked partitions, HMS will OOME. I suspect this is because it 
> attempts to do one large bulk load in an effort to save time. Ultimately, this 
> can lead to a collection so large that HMS eventually hits an Out of 
> Memory Exception. 
> Instead, I suggest that Hive include a configurable batch size that HMS can 
> use to break up the load. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
