[ https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008977#comment-14008977 ]
Ashutosh Chauhan commented on HIVE-6809:
----------------------------------------

API design is one concern, but my previous question was about a different issue. The patch currently checks whether the partitions to be dropped have a 'simpleSpec' (i.e. only string partition columns compared with equality). If so, it drops all of those partitions in one API call; if not, it uses the same API but drops them one by one in a for-loop. I would have assumed that we can do a bulk drop in both cases. Can you explain why it's better to drop one partition at a time when the spec is not a 'simpleSpec'?

> Support bulk deleting directories for partition drop with partial spec
> ----------------------------------------------------------------------
>
>                 Key: HIVE-6809
>                 URL: https://issues.apache.org/jira/browse/HIVE-6809
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>         Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt,
> HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt
>
>
> In a busy Hadoop system, dropping many partitions takes much more time than
> expected. In hive-0.11.0, removing 1700 partitions by a single partial spec
> took 90 minutes, which was reduced to 3 minutes when deleteData was set to false.
> I couldn't test this in recent Hive, which has HIVE-6256, but if most of the
> time is spent removing directories, it does not seem likely to help
> reduce the overall processing time.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
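
For readers following the thread, the two code paths described in the comment can be sketched roughly as below. This is only an illustration of the behaviour as the comment describes it, not code from the patch: `PartitionClient`, `dropMatching`, the `simpleSpec` flag, and the table and partition names are all hypothetical stand-ins and are not Hive's actual metastore API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PartitionDropSketch {

    /** Hypothetical stand-in for a metastore client; NOT Hive's real API. */
    interface PartitionClient {
        void dropPartitions(String table, List<String> partitionNames, boolean deleteData);
    }

    /**
     * Drops the given partitions and returns how many client calls were made.
     * A "simple spec" takes the bulk path (one call); anything else goes
     * through the same call once per partition, as the comment describes.
     */
    static int dropMatching(PartitionClient client, String table,
                            List<String> partitionNames, boolean simpleSpec,
                            boolean deleteData) {
        if (partitionNames.isEmpty()) {
            return 0;
        }
        if (simpleSpec) {
            // Bulk path: all partitions in a single call.
            client.dropPartitions(table, partitionNames, deleteData);
            return 1;
        }
        // Non-simple path: same call, issued one partition at a time.
        int calls = 0;
        for (String name : partitionNames) {
            client.dropPartitions(table, Collections.singletonList(name), deleteData);
            calls++;
        }
        return calls;
    }

    public static void main(String[] args) {
        // Record the size of each batch the hypothetical client receives.
        List<Integer> batchSizes = new ArrayList<>();
        PartitionClient recording =
                (table, names, deleteData) -> batchSizes.add(names.size());

        List<String> parts = Arrays.asList("ds=2014-05-01", "ds=2014-05-02", "ds=2014-05-03");
        int bulkCalls = dropMatching(recording, "logs", parts, true, true);
        int loopCalls = dropMatching(recording, "logs", parts, false, true);

        System.out.println("bulk calls=" + bulkCalls
                + ", loop calls=" + loopCalls
                + ", batch sizes=" + batchSizes);
    }
}
```

In this sketch the only difference between the two paths is the number of round trips, which is the asymmetry the comment asks about; whether the real patch has a further reason to avoid the bulk path for non-simple specs is not stated in this message.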