[ https://issues.apache.org/jira/browse/HIVE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ning Zhang updated HIVE-1699:
-----------------------------

    Attachment: HIVE-1699.patch

This patch includes the following changes:
1) correctly pruning partitions based on the partition specification in the ANALYZE TABLE command.
2) adding a Hive.getPartitionsByNames() method to get a list of partitions based on their names. Previously we had to use Hive.getPartitions(), which gets all partitions as Partition objects, and then filter out the partitions that do not satisfy the spec. This is very expensive for tables with a large number of partitions. This could be further improved by using the partition filtering pushdown feature once it is fully supported.
3) caching the list of partitions in tableSpec so that StatsTask does not need to get the list of partitions again.
4) adding an explicit tableSpec variable to indicate its type (TABLE_ONLY, STATIC_PARTITION, DYNAMIC_PARTITION) rather than relying on implicit checking on partHandle.

> incorrect partition pruning ANALYZE TABLE
> -----------------------------------------
>
>                 Key: HIVE-1699
>                 URL: https://issues.apache.org/jira/browse/HIVE-1699
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1699.patch
>
>
> If table T is partitioned, ANALYZE TABLE T PARTITION (...) COMPUTE
> STATISTICS; will gather stats for all partitions even though the partition
> spec only chooses a subset.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
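
For readers following items 2) and 4), here is a rough sketch of the idea and not the code in HIVE-1699.patch. It assumes partition names follow the usual key=value/key=value layout and that a partition column given without a value (as in PARTITION (ds='2008-04-08', hr)) is represented by a null entry in the spec map. The class PartitionPruningSketch and the helpers classify, parseName, matches and prune are hypothetical names made up for illustration; nothing here calls the real Hive API. The point is that matching on plain name strings lets the caller fetch Partition objects only for the survivors.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionPruningSketch {

    /** The three tableSpec kinds named in the patch description. */
    public enum SpecType { TABLE_ONLY, STATIC_PARTITION, DYNAMIC_PARTITION }

    /**
     * Classify a partition spec explicitly instead of inferring it later.
     * Assumption: a null value means "column listed without a value".
     */
    public static SpecType classify(Map<String, String> spec) {
        if (spec == null || spec.isEmpty()) {
            return SpecType.TABLE_ONLY;
        }
        for (String value : spec.values()) {
            if (value == null) {
                return SpecType.DYNAMIC_PARTITION;
            }
        }
        return SpecType.STATIC_PARTITION;
    }

    /** Parse "ds=2008-04-08/hr=11" into an ordered map (escaping is ignored here). */
    public static Map<String, String> parseName(String name) {
        Map<String, String> kv = new LinkedHashMap<>();
        for (String part : name.split("/")) {
            String[] pair = part.split("=", 2);
            kv.put(pair[0], pair.length > 1 ? pair[1] : "");
        }
        return kv;
    }

    /** A partition matches when every column the spec fixes has the same value. */
    public static boolean matches(Map<String, String> partition, Map<String, String> spec) {
        for (Map.Entry<String, String> e : spec.entrySet()) {
            if (e.getValue() != null && !e.getValue().equals(partition.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Keep only the names that satisfy the spec; no Partition objects are built. */
    public static List<String> prune(List<String> names, Map<String, String> spec) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (matches(parseName(name), spec)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("ds=2008-04-08/hr=11");
        names.add("ds=2008-04-08/hr=12");
        names.add("ds=2008-04-09/hr=11");

        // Corresponds to PARTITION (ds='2008-04-08', hr)
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("ds", "2008-04-08");
        spec.put("hr", null);

        System.out.println(classify(spec));     // DYNAMIC_PARTITION
        System.out.println(prune(names, spec)); // [ds=2008-04-08/hr=11, ds=2008-04-08/hr=12]
    }
}
```

In this sketch the pruned name list is what would be handed to a by-name lookup such as the Hive.getPartitionsByNames() method described in item 2), and it could be cached alongside the spec as item 3) describes.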