[ 
https://issues.apache.org/jira/browse/HIVE-1648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934619#action_12934619
 ] 

Namit Jain commented on HIVE-1648:
----------------------------------

I haven't looked at the code yet, but here are my comments on the tests:


Instead of:


desc extended <table_name> in the tests,
please use
show table extended like `<table_name>`;


This dumps the stats on a new line, where they can be compared easily.
The non-deterministic stats are ignored.


Add a test for limit in the sub-query.

Don't select from the existing tables (src/src1) in your stats tests.
Create new tables and then set hive.stats.autogather.read to true.
This way, you can be sure that the remaining tests will not be affected.
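
A minimal sketch of how such an isolated test might look, including the
LIMIT sub-query case mentioned above. The table name stats_tbl is
hypothetical, and hive.stats.autogather.read is the setting named in this
comment; it comes from the patch under review, so its exact behavior is an
assumption here:

```sql
-- stats_tbl is a hypothetical name used only for this sketch.
CREATE TABLE stats_tbl (key STRING, value STRING);

-- Populate the new table before enabling read-time stats gathering
-- (assumption: this keeps the read of src outside the feature under test).
INSERT OVERWRITE TABLE stats_tbl SELECT key, value FROM src;

SET hive.stats.autogather.read=true;

-- Full scan of the new table, then dump its stats for comparison.
SELECT COUNT(1) FROM stats_tbl;
SHOW TABLE EXTENDED LIKE `stats_tbl`;

-- LIMIT inside a sub-query, then dump the stats again.
SELECT t.key FROM (SELECT key FROM stats_tbl LIMIT 10) t;
SHOW TABLE EXTENDED LIKE `stats_tbl`;

DROP TABLE stats_tbl;
```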

Add another test for a 3-way join where the join keys are not the same,
something like:

select .. from A join B on A.key1 = B.key1 join C on B.key2 = C.key2 where ....
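
One possible concrete form of that query, as a sketch only. The tables
stats_a, stats_b and stats_c, their columns, and the WHERE predicate are
hypothetical and chosen purely for illustration:

```sql
-- Hypothetical tables; the first join uses key1, the second uses key2.
CREATE TABLE stats_a (key1 STRING, val STRING);
CREATE TABLE stats_b (key1 STRING, key2 STRING);
CREATE TABLE stats_c (key2 STRING, val STRING);

-- Fill the new tables from src before turning on read-time stats gathering.
INSERT OVERWRITE TABLE stats_a SELECT key, value FROM src;
INSERT OVERWRITE TABLE stats_b SELECT key, value FROM src;
INSERT OVERWRITE TABLE stats_c SELECT value, key FROM src;

SET hive.stats.autogather.read=true;

-- 3-way join in which the two join conditions use different keys.
SELECT a.key1, c.val
FROM stats_a a
JOIN stats_b b ON (a.key1 = b.key1)
JOIN stats_c c ON (b.key2 = c.key2)
WHERE a.key1 < 100;

-- Dump the stats of each scanned table for comparison.
SHOW TABLE EXTENDED LIKE `stats_a`;
SHOW TABLE EXTENDED LIKE `stats_b`;
SHOW TABLE EXTENDED LIKE `stats_c`;
```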


> Automatically gathering stats when reading a table/partition
> ------------------------------------------------------------
>
>                 Key: HIVE-1648
>                 URL: https://issues.apache.org/jira/browse/HIVE-1648
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Ning Zhang
>            Assignee: Paul Butler
>         Attachments: HIVE-1648.2.patch, HIVE-1648.3.patch, HIVE-1648.patch
>
>
> HIVE-1361 introduces a new command 'ANALYZE TABLE T COMPUTE STATISTICS' to 
> gather stats. This requires an additional scan of the data. Stats gathering 
> can be piggy-backed on TableScanOperator whenever a table/partition is 
> scanned (given no LIMIT operator). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
