[jira] [Commented] (HIVE-6492) limit partition number involved in a table scan

Gunther Hagleitner (JIRA) Mon, 03 Mar 2014 16:18:13 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-6492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13918793#comment-13918793
 ]


Gunther Hagleitner commented on HIVE-6492:
------------------------------------------

[~selinazh] Can you open a Review Board request for this? I have a few more 
comments:

- Can you add a test for the stats optimizer? Since you're checking for an 
explicit limit on the fetch operator, I think that case would still bail 
(i.e., select count(*) from foo with stats available and 
hive.compute.query.using.stats = true).
- Your patch only works in MR (since you're computing access at the physical 
level)
- We already have the pruned list of partitions available at the logical level

If you move your code to right after we call Optimizer.optimize in the 
SemanticAnalyzer, you can make both cases work.

Logic should be:
- If there is a fetch operator at this level, let it pass (no mapreduce job 
will be launched).
- Otherwise, go through the parse context's top ops and use opToPartPruner to 
find out how many partitions are going to be accessed.
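
A rough sketch of that logic, in plain Java with no Hive dependency. 
Everything here is a hypothetical stand-in: the class and method names, the 
boolean flag for "a fetch operator exists", and the map from table name to 
pruned partition list (standing in for the top ops plus opToPartPruner). It 
also assumes a negative limit means "no limit", as in the issue description, 
and that a scan fails only when it touches strictly more partitions than the 
limit.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class PartitionLimitSketch {

    /**
     * Returns the name of the first table whose pruned partition count
     * exceeds maxPartitions, or empty if the query may proceed.
     *
     * prunedPartitionsByTable is a hypothetical stand-in for walking the
     * parse context's top ops and asking the partition pruner for each one.
     */
    static Optional<String> findViolation(
            boolean hasFetchOperator,
            Map<String, List<String>> prunedPartitionsByTable,
            int maxPartitions) {
        // Assumption: a negative value (default -1) disables the check.
        if (maxPartitions < 0) {
            return Optional.empty();
        }
        // A fetch operator at this level means no mapreduce job is launched,
        // so the query is let through.
        if (hasFetchOperator) {
            return Optional.empty();
        }
        // Otherwise count the partitions each table scan would access.
        for (Map.Entry<String, List<String>> scan : prunedPartitionsByTable.entrySet()) {
            if (scan.getValue().size() > maxPartitions) {
                return Optional.of(scan.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, List<String>> scans =
                Map.of("foo", List.of("ds=1", "ds=2", "ds=3"));
        System.out.println(findViolation(false, scans, 2).isPresent());
        System.out.println(findViolation(true, scans, 2).isPresent());
        System.out.println(findViolation(false, scans, -1).isPresent());
    }
}
```

Because the check only looks at the pruned partition lists, it does not 
depend on which execution engine later runs the plan.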

Does that make sense?

> limit partition number involved in a table scan
> -----------------------------------------------
>
>                 Key: HIVE-6492
>                 URL: https://issues.apache.org/jira/browse/HIVE-6492
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>    Affects Versions: 0.12.0
>            Reporter: Selina Zhang
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6492.1.patch.txt, HIVE-6492.2.patch.txt, 
> HIVE-6492.3.patch.txt
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> To protect the cluster, a new configuration variable 
> "hive.limit.query.max.table.partition" is added to the Hive configuration to
> limit the number of table partitions involved in a table scan. 
> The default value will be set to -1, which means there is no limit by default. 
> This variable will not affect "metadata only" queries.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
