[ https://issues.apache.org/jira/browse/HIVE-8456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14173247#comment-14173247 ]
Rui Li commented on HIVE-8456:
------------------------------

I'm not familiar with how the counter/accumulator works. Just a few high-level questions:

1. Shall we think of better names for the new classes? The naming (e.g. SparkCounterGroup and SparkCounters) seems a little confusing to me.
2. Have we defined all the counters in {{SparkCounters.initializeSparkCounters}}? For example, it seems {{Operator.HIVECOUNTERFATAL}} isn't added there.
3. The Counter enum in operators doesn't seem to be used as a "Counter" in Hive. Rather, it's just kept in {{statsMap : HashMap<Enum<?>, LongWritable>}}. Maybe we shouldn't add them as SparkCounter? If we do want to wrap them as SparkCounter, there are other operators to handle besides MapOperator, e.g. FilterOperator and JoinOperator also have such an enum.
4. Maybe we should always use {{HiveConf.ConfVars.HIVECOUNTERGROUP}} as the group name, rather than the enum class name ({{key.getDeclaringClass().getName()}})?

> Support Hive Counter to collect spark job metric[Spark Branch]
> --------------------------------------------------------------
>
>                 Key: HIVE-8456
>                 URL: https://issues.apache.org/jira/browse/HIVE-8456
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>              Labels: Spark-M3
>         Attachments: HIVE-8456.1-spark.patch, HIVE-8456.2-spark.patch
>
>
> Several Hive query metrics in Hive operators are collected by Hive Counter,
> such as CREATEDFILES and DESERIALIZE_ERRORS. Besides, Hive uses Counter as an
> option to collect table stats info. Spark supports Accumulator, which is
> pretty similar to Hive Counter, so we could try to enable Hive Counter based
> on it.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)