[ 
https://issues.apache.org/jira/browse/HIVE-5921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838099#comment-13838099
 ] 

Prasanth J commented on HIVE-5921:
----------------------------------

The FILTER rule now evaluates each predicate expression. The JOIN rule now 
accepts hints from the user in the form of a Hive config. In the absence of 
basic statistics (row count and data size), the estimated row count or data 
size is computed from the average row size, which is derived from the schema. 
Regenerated all affected tests.
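
As a rough illustration of the schema-based fallback, here is a minimal Python
sketch. It is not the code from the patch: the per-type byte sizes and the
function names are assumptions made up for this example. It only shows how a
missing row count or data size can be derived from an average row size.

```python
# Hypothetical per-type sizes in bytes. These values are illustrative
# assumptions, not the sizes Hive actually uses.
ASSUMED_TYPE_SIZE = {
    "tinyint": 1, "smallint": 2, "int": 4, "bigint": 8,
    "float": 4, "double": 8, "boolean": 1, "string": 100,
}


def avg_row_size(schema):
    """Sum the assumed size of every column type in the schema."""
    return sum(ASSUMED_TYPE_SIZE[col_type] for _, col_type in schema)


def estimate_row_count(data_size, schema):
    """Estimate rows as data size divided by the average row size.

    Non-empty data is assumed to hold at least one row.
    """
    if data_size <= 0:
        return 0
    return max(1, data_size // avg_row_size(schema))


def estimate_data_size(row_count, schema):
    """Estimate data size as row count times the average row size."""
    return row_count * avg_row_size(schema)


# Hypothetical table: (column name, column type) pairs.
schema = [("id", "bigint"), ("name", "string"), ("score", "double")]
print(avg_row_size(schema))                    # 116
print(estimate_row_count(1_160_000, schema))   # 10000
```

Whichever of the two basic statistics is known can be used to back out the
other; if neither is known, an estimate of this kind has nothing to start from.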

> Better heuristics for worst case statistics estimates for join, limit and 
> filter operator
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-5921
>                 URL: https://issues.apache.org/jira/browse/HIVE-5921
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Query Processor, Statistics
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5921.1.patch
>
>
> This is a subtask of HIVE-5369. In the worst case (i.e., absence of column 
> statistics), HIVE-5849 improved the basic statistics with heuristics. But the 
> heuristics failed to provide better estimates in a few cases. For example, the 
> FILTER operator heuristics did not take into account the number of predicates 
> or whether a predicate contains a partition column. Also, JOIN estimates were 
> too aggressive and were not user-configurable.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
