[ 
https://issues.apache.org/jira/browse/HIVE-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ning Zhang updated HIVE-1699:
-----------------------------

    Attachment: HIVE-1699.patch

This patch includes the following changes:

  1) Correctly pruning partitions based on the partition specification in the
ANALYZE TABLE command.
  2) Adding a Hive.getPartitionsByNames() method to get a list of partitions
based on their names. Previously we had to use Hive.getPartitions, which
fetches all partitions as Partition objects, and then filter out the partitions
that do not satisfy the spec. This is very expensive for tables with a large
number of partitions. It could be further improved by using the partition
filtering pushdown feature once that is fully supported.
  3) Caching the list of partitions in tableSpec so that StatsTask does not
need to get the list of partitions again.
  4) Adding an explicit variable in tableSpec to indicate its type (TABLE_ONLY,
STATIC_PARTITION, DYNAMIC_PARTITION) rather than relying on implicit checks
on partHandle.
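
To illustrate items 1, 2 and 4, here is a minimal, self-contained Java sketch of
pruning by partition name before any Partition objects are built. It is not the
code in the patch and does not call Hive. The class PartitionPruningSketch and
its methods classify, matches and prune are hypothetical names, and the real
signature of Hive.getPartitionsByNames() is not shown in this message. The
sketch assumes partition names of the form "col=value/col=value" with no
escaping, and assumes that a spec with every column bound is STATIC_PARTITION
while a spec with an unbound column (modeled as a null value) is
DYNAMIC_PARTITION; the patch may draw that line differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionPruningSketch {

    /** Hypothetical mirror of the three spec types named in the patch notes. */
    public enum SpecType { TABLE_ONLY, STATIC_PARTITION, DYNAMIC_PARTITION }

    /**
     * Assumption: no spec means TABLE_ONLY, a spec with an unbound column
     * (null value) is DYNAMIC_PARTITION, anything else is STATIC_PARTITION.
     */
    public static SpecType classify(Map<String, String> spec) {
        if (spec == null || spec.isEmpty()) {
            return SpecType.TABLE_ONLY;
        }
        return spec.containsValue(null)
                ? SpecType.DYNAMIC_PARTITION
                : SpecType.STATIC_PARTITION;
    }

    /** Split "ds=2010-10-01/hr=12" into {ds=2010-10-01, hr=12}. No unescaping. */
    static Map<String, String> parseName(String name) {
        Map<String, String> kv = new LinkedHashMap<>();
        for (String part : name.split("/")) {
            int eq = part.indexOf('=');
            if (eq < 0) {
                continue; // skip malformed components in this sketch
            }
            kv.put(part.substring(0, eq), part.substring(eq + 1));
        }
        return kv;
    }

    /** A name matches when every bound column in the spec has the same value. */
    public static boolean matches(String name, Map<String, String> spec) {
        Map<String, String> kv = parseName(name);
        for (Map.Entry<String, String> e : spec.entrySet()) {
            if (e.getValue() != null && !e.getValue().equals(kv.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    /** Keep only the names that satisfy the spec; works on cheap strings only. */
    public static List<String> prune(List<String> allNames, Map<String, String> spec) {
        List<String> kept = new ArrayList<>();
        for (String name : allNames) {
            if (matches(name, spec)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Models PARTITION (ds='2010-10-01', hr): ds is bound, hr is not.
        Map<String, String> spec = new LinkedHashMap<>();
        spec.put("ds", "2010-10-01");
        spec.put("hr", null);

        List<String> names = Arrays.asList(
                "ds=2010-10-01/hr=11", "ds=2010-10-01/hr=12", "ds=2010-10-02/hr=11");

        System.out.println(classify(spec));
        System.out.println(prune(names, spec));
    }
}
```

The point of filtering on names first is that only the surviving names need to
be turned into full Partition objects, and the resulting list can then be
cached alongside the spec so that a later task does not have to look it up
again.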

> incorrect partition pruning ANALYZE TABLE
> -----------------------------------------
>
>                 Key: HIVE-1699
>                 URL: https://issues.apache.org/jira/browse/HIVE-1699
>             Project: Hadoop Hive
>          Issue Type: Bug
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1699.patch
>
>
> If table T is partitioned, ANALYZE TABLE T PARTITION (...) COMPUTE
> STATISTICS; gathers stats for all partitions even though the partition spec
> selects only a subset.

