[ 
https://issues.apache.org/jira/browse/HIVE-12411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025967#comment-15025967
 ] 

Lefty Leverenz commented on HIVE-12411:
---------------------------------------

Doc note:  This changes *hive.stats.dbclass* (removing counter as a value) and 
removes *hive.stats.key.prefix.reserve.length* so the wiki needs to be updated 
for release 2.0.0.

* [Configuration Properties -- hive.stats.dbclass | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.dbclass]
* [Configuration Properties -- hive.stats.key.prefix.reserve.length | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-hive.stats.key.prefix.reserve.length]

The Statistics doc does not mention counter-based stats so no update is 
required, although an explanation of collection mechanisms would be a helpful 
addition.   *hive.stats.dbclass* is discussed in the Usage section.

* [Statistics in Hive | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev]
** [Implementation | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Implementation]
** [Usage | 
https://cwiki.apache.org/confluence/display/Hive/StatsDev#StatsDev-Usage]

> Remove counter based stats collection mechanism
> -----------------------------------------------
>
>                 Key: HIVE-12411
>                 URL: https://issues.apache.org/jira/browse/HIVE-12411
>             Project: Hive
>          Issue Type: Task
>          Components: Statistics
>    Affects Versions: 1.2.0, 1.2.1
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>              Labels: TODOC2.0
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12411.01.patch, HIVE-12411.02.patch
>
>
> Following HIVE-12005 and HIVE-12164, we have removed the jdbc and hbase stats 
> collection mechanisms. Now we are targeting the counter-based stats collection 
> mechanism. The main reasons for removing it are as follows: (1) counter-based 
> stats have a limit on the length of the counter itself; if it is too long, MD5 
> will be applied. (2) When there are a large number of partitions and columns, 
> we need to create a large number of counters in memory, which puts a heavy 
> load on the M/R AM, Tez AM, etc. FS-based stats will do a better job.
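
As an illustration of limitation (1) in the description above, here is a minimal sketch of how an over-long counter key might be replaced by its MD5 digest. This is not Hive's actual code: the names MAX_COUNTER_NAME_LEN and shorten_key, the limit of 64, and the key layout are all assumptions made for the example.

```python
import hashlib

# Hypothetical limit, for illustration only; the real limit depends on the
# execution engine's counter settings.
MAX_COUNTER_NAME_LEN = 64


def shorten_key(key: str, max_len: int = MAX_COUNTER_NAME_LEN) -> str:
    """Return the key unchanged if it fits, else its MD5 hex digest."""
    if len(key) <= max_len:
        return key
    # The digest is always 32 hex characters, so it fits the limit, but the
    # original partition path can no longer be read back from it.
    return hashlib.md5(key.encode("utf-8")).hexdigest()


# A short, single-partition key is kept as is.
short = shorten_key("db.tbl/ds=2015-11-24/")

# A key built from many partition columns exceeds the limit and is hashed.
long_key = "db.tbl/" + "/".join(f"p{i}=v{i}" for i in range(20)) + "/"
hashed = shorten_key(long_key)
```

The hashed key is fixed-length but opaque, which is one reason a readable, file-based key is easier to work with.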



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
