[ 
https://issues.apache.org/jira/browse/HIVE-7506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14086701#comment-14086701
 ] 

pengcheng xiong commented on HIVE-7506:
---------------------------------------

Hi Lars,

    This is my first patch, so sorry for the inconvenience.

    My answers are inline below:

    Could you update the existing review (linked in this issue now) instead of 
creating a new one? Again, if you need help let me know.

    Ans > So I should always post to https://reviews.apache.org/r/24289/ from 
now on?

    The latest review is still not using spaces everywhere, and there are also 
a lot of unrelated whitespace changes. If you're using IntelliJ, I'm happy to 
help get you set up.

    Ans > I am using Eclipse. Under Project -> Properties -> Java Code 
Style -> Formatter -> Edit, I have set:

      Tab policy: Spaces only
      Tab size: 2

    Is that enough?
   
    Could you comment on the authorization part? I'm not too sure about this 
myself.

    Ans > I have no idea about the authorization part either. I do not think I 
have made any changes to it.

    Having only taken a cursory look so far: Why did the fields in 
MTableColumnStatistics etc. change from primitives to boxed objects (long -> 
Long etc.)?

    Ans > The reason is that they may be null while MetadataUpdater is 
running. For example, the user can specify that the number of distinct values 
is "100" without specifying a value for any of the other statistics fields. In 
that case, those fields will be null. If we use "double" or "long" rather than 
"Double" or "Long", we cannot set them to null.
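
    To illustrate the point, here is a minimal Java sketch. The class 
ColumnStatsSketch and its field names are hypothetical and are not copied from 
MTableColumnStatistics; they only show how a boxed field can distinguish "not 
specified" (null) from a real value, whereas a primitive long would default to 
0 and look like a genuine statistic.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for a column-statistics model class. The name and
 * fields are illustrative assumptions, not the actual Hive source.
 */
public class ColumnStatsSketch {
    // Boxed types: null means "the user did not supply this statistic".
    private Long numDistinctValues;
    private Long numNulls;
    private Double avgColLen;

    public void setNumDistinctValues(Long v) { this.numDistinctValues = v; }
    public void setNumNulls(Long v) { this.numNulls = v; }
    public void setAvgColLen(Double v) { this.avgColLen = v; }

    /** Lists only the statistics that were explicitly set. */
    public List<String> specifiedFields() {
        List<String> out = new ArrayList<>();
        if (numDistinctValues != null) out.add("numDistinctValues=" + numDistinctValues);
        if (numNulls != null) out.add("numNulls=" + numNulls);
        if (avgColLen != null) out.add("avgColLen=" + avgColLen);
        return out;
    }

    public static void main(String[] args) {
        ColumnStatsSketch stats = new ColumnStatsSketch();
        stats.setNumDistinctValues(100L); // only one statistic supplied
        System.out.println(stats.specifiedFields()); // prints [numDistinctValues=100]
    }
}
```

    With primitive fields, numNulls would read as 0 here, and an update could 
not tell whether the user meant "zero nulls" or simply left the value out.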

    I have uploaded a new patch. Could you please take a look? Thanks!

> MetadataUpdater: provide a mechanism to edit the statistics of a column in a 
> table (or a partition of a table)
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-7506
>                 URL: https://issues.apache.org/jira/browse/HIVE-7506
>             Project: Hive
>          Issue Type: New Feature
>          Components: Database/Schema
>            Reporter: pengcheng xiong
>            Assignee: pengcheng xiong
>            Priority: Minor
>         Attachments: HIVE-7506.1.patch, HIVE-7506.1.patch, HIVE-7506.3.patch, 
> HIVE-7506.4.patch, HIVE-7506.patch
>
>   Original Estimate: 252h
>  Remaining Estimate: 252h
>
> Two motivations:
> (1) The Cost-based Optimizer (CBO) depends heavily on the statistics of a 
> column in a table (or a partition of a table). If we would like to test 
> whether CBO chooses the best plan under different statistics, it would be 
> time-consuming to load the whole table and create the statistics from the 
> ground up.
> (2) As the database runs, the statistics of a column in a table (or a 
> partition of a table) may change. We need a mechanism to keep them 
> synchronized. 
> We propose the following command to achieve that:
> ALTER TABLE table_name PARTITION partition_spec [COLUMN col_name] UPDATE 
> STATISTICS col_statistics [COMMENT col_comment]



--
This message was sent by Atlassian JIRA
(v6.2#6252)
