[
https://issues.apache.org/jira/browse/HIVE-6809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009186#comment-14009186
]
Navis commented on HIVE-6809:
-----------------------------
1. In DDLTask(3942), multiple simple specs can be given, and they are
processed one at a time by iteration. I assumed that is not a common case, though.
{noformat}
for (Map<String, String> spec : simpleSpecs) {
  List<Partition> dropped =
      db.dropPartitions(tbl.getDbName(), tbl.getTableName(), toPartValues(tbl, spec), true);
  droppedParts.addAll(dropped);
}
{noformat}
2. In HiveMetastore(2247)
{noformat}
for (Partition part : parts) {
  // copy values, which would be removed after drop
  part.setValues(new ArrayList<String>(part.getValues()));
  if (!ms.dropPartition(db_name, tbl_name, part.getValues())) {
    throw new MetaException("Unable to drop partition");
  }
}
{noformat}
This could simply be changed to:
{noformat}
for (Partition part : parts) {
  // copy values, which would be removed after drop
  part.setValues(new ArrayList<String>(part.getValues()));
}
ms.dropPartitions(db_name, tbl_name, Arrays.asList(partName));
{noformat}
3. I tried to use the new API with DropPartitionsRequest, but it's too
complicated, and I wasn't fully convinced that it makes sense to convert a
simple List<String> into ExprDescs, then into binary, and back again. It might
be possible to use it, but I felt uncomfortable doing so.
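
To illustrate the change suggested in item 2, here is a minimal, self-contained sketch of the difference between dropping partitions one call at a time and dropping them in a single batched call. Everything in it is hypothetical: BulkDropSketch and ToyStore are in-memory stand-ins invented for this example, not Hive classes, and the method signatures are simplified. It only shows the call pattern, not how the metastore actually implements either path.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these types exist in Hive.
final class BulkDropSketch {

    /** Toy in-memory store that counts how many calls reach it. */
    static final class ToyStore {
        private final Map<String, List<String>> partitions = new LinkedHashMap<>();
        int calls = 0;

        void add(String name, List<String> values) {
            partitions.put(name, new ArrayList<>(values));
        }

        // One store call per partition, like the loop currently in item 2.
        boolean dropPartition(String name) {
            calls++;
            return partitions.remove(name) != null;
        }

        // One store call for the whole batch, like the proposed change.
        int dropPartitions(List<String> names) {
            calls++;
            int dropped = 0;
            for (String name : names) {
                if (partitions.remove(name) != null) {
                    dropped++;
                }
            }
            return dropped;
        }

        int size() {
            return partitions.size();
        }
    }

    /** Builds a store holding one partition per name (values are placeholders). */
    static ToyStore newStore(List<String> names) {
        ToyStore store = new ToyStore();
        for (String name : names) {
            store.add(name, Arrays.asList(name));
        }
        return store;
    }

    /** Drops each partition with its own call; fails fast if one is missing. */
    static int dropOneByOne(ToyStore store, List<String> names) {
        int dropped = 0;
        for (String name : names) {
            if (!store.dropPartition(name)) {
                throw new IllegalStateException("Unable to drop partition " + name);
            }
            dropped++;
        }
        return dropped;
    }

    /** Drops all partitions with a single batched call. */
    static int dropInBulk(ToyStore store, List<String> names) {
        return store.dropPartitions(names);
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ds=2014-01-01", "ds=2014-01-02", "ds=2014-01-03");

        ToyStore perItem = newStore(names);
        dropOneByOne(perItem, names);
        System.out.println("one-by-one store calls: " + perItem.calls);

        ToyStore bulk = newStore(names);
        dropInBulk(bulk, names);
        System.out.println("bulk store calls: " + bulk.calls);
    }
}
```

In this toy model the batched path reaches the store once regardless of how many partitions match, which is the property the proposed change is after. Whether that translates into a real speedup depends on where the time is actually spent, for example in directory removal.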
> Support bulk deleting directories for partition drop with partial spec
> ----------------------------------------------------------------------
>
> Key: HIVE-6809
> URL: https://issues.apache.org/jira/browse/HIVE-6809
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Navis
> Assignee: Navis
> Attachments: HIVE-6809.1.patch.txt, HIVE-6809.2.patch.txt,
> HIVE-6809.3.patch.txt, HIVE-6809.4.patch.txt, HIVE-6809.5.patch.txt
>
>
> In a busy Hadoop system, dropping many partitions takes much longer than
> expected. In hive-0.11.0, removing 1700 partitions with a single partial spec
> took 90 minutes, which dropped to 3 minutes when deleteData was set to false.
> I couldn't test this in recent Hive, which has HIVE-6256, but if most of the
> time is spent removing directories, that change seems unlikely to reduce the
> overall processing time.