[ 
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008977#comment-14008977
 ] 

Ashutosh Chauhan commented on HIVE-6809:
----------------------------------------

API design is one concern, but my previous question was about a different 
issue. The patch currently checks whether the partitions to be dropped have a 
'simpleSpec' (i.e., only string partition columns compared with equality). If 
they do, it drops all of those partitions in one API call; if they don't, it 
uses the same API but drops them one by one in a for-loop. I would have 
assumed that we can do a bulk drop in both cases. Can you explain why it's 
better to drop one partition at a time when the spec is not a 'simpleSpec'?
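
To make the question concrete, here is a minimal Java sketch of the two 
strategies. It is not the patch's code: CountingStore, dropPartitions, 
dropAsPatched and dropAlwaysBulk are hypothetical names made up for 
illustration, and the store only counts calls instead of talking to a 
metastore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a metastore client; NOT Hive's actual API.
class CountingStore {
    final List<String> partitions = new ArrayList<>();
    int calls = 0; // number of drop calls issued so far

    CountingStore(List<String> initial) {
        partitions.addAll(initial);
    }

    // A single call drops every partition named in the list.
    void dropPartitions(List<String> names) {
        calls++;
        partitions.removeAll(names);
    }
}

class DropSketch {
    // Behaviour as described in the comment: one bulk call for a simple spec,
    // otherwise the same call issued once per matched partition.
    static void dropAsPatched(CountingStore store, List<String> matched,
                              boolean simpleSpec) {
        if (simpleSpec) {
            store.dropPartitions(matched);
        } else {
            for (String name : matched) {
                store.dropPartitions(Arrays.asList(name));
            }
        }
    }

    // The alternative being asked about: bulk drop regardless of spec type.
    static void dropAlwaysBulk(CountingStore store, List<String> matched) {
        store.dropPartitions(matched);
    }
}
```

In this toy model both strategies remove the same partitions and differ only 
in the number of calls (one per partition versus one in total), which is why 
the per-partition branch needs a justification.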

> Support bulk deleting directories for partition drop with partial spec
> ----------------------------------------------------------------------
>
>                 Key: HIVE-6809
>                 URL: https://issues.apache.org/jira/browse/HIVE-6809
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>         Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt, 
> HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt
>
>
> In a busy Hadoop system, dropping many partitions takes much more time than 
> expected. In hive-0.11.0, removing 1700 partitions with a single partial 
> spec took 90 minutes, which dropped to 3 minutes when deleteData was set to 
> false. I couldn't test this in a recent Hive, which has HIVE-6256, but if 
> most of the time is spent removing directories, that change seems unlikely 
> to reduce the overall processing time.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
