[ 
https://issues.apache.org/jira/browse/HIVE-7438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Brock Noland updated HIVE-7438:
-------------------------------

    Description: 
Hive makes use of MapReduce counters for statistics and possibly for other 
purposes. For Hive on Spark, we should achieve the same functionality using 
Spark's accumulators.
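
As a rough illustration of the intended mapping (a sketch only, not the 
proposed implementation), a named Spark accumulator can stand in for a 
MapReduce counter; the snippet below assumes the Spark 1.x Java API and a 
hypothetical counter name:

{code:java}
import java.util.Arrays;
import org.apache.spark.Accumulator;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class CounterSketch {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("counter-sketch").setMaster("local[*]");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // A named accumulator standing in for a MapReduce counter.
    Accumulator<Integer> recordsOut = sc.accumulator(0, "RECORDS_OUT");

    JavaRDD<String> rows = sc.parallelize(Arrays.asList("a", "b", "c"));
    // Executors increment the accumulator, analogous to reporter.incrCounter(...).
    rows.foreach(row -> recordsOut.add(1));

    // Driver side: read the aggregated value once the job has finished,
    // much like reading counters from a completed MapReduce job.
    System.out.println("RECORDS_OUT = " + recordsOut.value());
    sc.stop();
  }
}
{code}

One semantic difference worth covering in the design: accumulator updates made 
inside transformations are not guaranteed to be applied exactly once when tasks 
are re-executed, whereas MapReduce counters only reflect successful task 
attempts.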

Hive also traditionally collects metrics from MapReduce jobs. Spark jobs very 
likely publish a different set of metrics, which, if made available, would 
help users gain insight into their Spark jobs. Thus, we should obtain these 
metrics and make them available, as we do for MapReduce.
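
As a similar sketch of how such metrics might be gathered on the driver 
(assuming a Spark version where SparkListener can be extended directly and 
TaskMetrics exposes input/output metrics, e.g. Spark 2.x; the class name and 
the metrics aggregated here are illustrative only):

{code:java}
import java.util.concurrent.atomic.AtomicLong;
import org.apache.spark.scheduler.SparkListener;
import org.apache.spark.scheduler.SparkListenerTaskEnd;

// Aggregates a couple of task-level metrics published by Spark, roughly the
// counterpart of the job-level metrics Hive reads from MapReduce today.
public class HiveSparkMetricsListener extends SparkListener {
  private final AtomicLong bytesRead = new AtomicLong();
  private final AtomicLong recordsWritten = new AtomicLong();

  @Override
  public void onTaskEnd(SparkListenerTaskEnd taskEnd) {
    // Task metrics may be absent for failed tasks, hence the null check.
    if (taskEnd.taskMetrics() != null) {
      bytesRead.addAndGet(taskEnd.taskMetrics().inputMetrics().bytesRead());
      recordsWritten.addAndGet(taskEnd.taskMetrics().outputMetrics().recordsWritten());
    }
  }

  public long getBytesRead() { return bytesRead.get(); }
  public long getRecordsWritten() { return recordsWritten.get(); }
}
{code}

Such a listener could be registered via SparkContext.addSparkListener(...) or 
the spark.extraListeners configuration; deciding exactly which Spark metrics to 
surface, and how to map them onto Hive's existing statistics, is part of this 
task.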

This task therefore includes:

# identify Hive's existing functionality w.r.t. counters, statistics, and 
metrics;
# design and implement the same functionality in Spark.

Please refer to the design document for more information. 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics

  was:
Hive makes use of MapReduce counters for statistics and possibly for other 
purposes. For Hive on Spark, we should achieve the same functionality using 
Spark's accumulators.

Hive also traditionally collects metrics from MapReduce jobs. Spark jobs very 
likely publish a different set of metrics, which, if made available, would 
help users gain insight into their Spark jobs. Thus, we should obtain these 
metrics and make them available, as we do for MapReduce.

This task therefore includes: 1. identifying Hive's existing functionality 
w.r.t. counters, statistics, and metrics; 2. designing and implementing the 
same functionality in Spark.

Please refer to the design document for more information. 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics


> Counters, statistics, and metrics
> ---------------------------------
>
>                 Key: HIVE-7438
>                 URL: https://issues.apache.org/jira/browse/HIVE-7438
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chengxiang Li
>         Attachments: hive on spark job statistic design.docx
>
>
> Hive makes use of MapReduce counters for statistics and possibly for other 
> purposes. For Hive on Spark, we should achieve the same functionality using 
> Spark's accumulators.
> Hive also traditionally collects metrics from MapReduce jobs. Spark jobs very 
> likely publish a different set of metrics, which, if made available, would 
> help users gain insight into their Spark jobs. Thus, we should obtain these 
> metrics and make them available, as we do for MapReduce.
> This task therefore includes:
> # identify Hive's existing functionality w.r.t. counters, statistics, and 
> metrics;
> # design and implement the same functionality in Spark.
> Please refer to the design document for more information. 
> https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark#HiveonSpark-CountersandMetrics


