[ 
https://issues.apache.org/jira/browse/HIVE-25081?focusedWorklogId=608549&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-608549
 ]

ASF GitHub Bot logged work on HIVE-25081:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Jun/21 15:58
            Start Date: 08/Jun/21 15:58
    Worklog Time Spent: 10m 
      Work Description: klcopp commented on a change in pull request #2332:
URL: https://github.com/apache/hive/pull/2332#discussion_r647567066



##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Initiator.java
##########
@@ -120,7 +120,7 @@ public void run() {
         // don't doom the entire thread.
         try {
           handle = txnHandler.getMutexAPI().acquireLock(TxnStore.MUTEX_KEY.Initiator.name());
-          if (metricsEnabled) {
+          if (metricsEnabled && MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)) {

Review comment:
       Same as my comment on Cleaner.java.

##########
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##########
@@ -454,6 +454,8 @@ public static ConfVars getMetaConf(String name) {
         "hive.metastore.acidmetrics.check.interval", 300,
         TimeUnit.SECONDS,
         "Time in seconds between acid related metric collection runs."),
+    METASTORE_ACIDMETRICS_EXT_ON("metastore.acidmetrics.ext.on", "hive.metastore.acidmetrics.ext.on", true,
+        "Whether to collect additional acid related metrics outside of the acid metrics service."),

Review comment:
       I think these are only enabled if
`MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METRICS_ENABLED)` is true,
so it would be good to mention that in the description.

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java
##########
@@ -115,41 +117,45 @@ public static DeltaFilesMetricReporter getInstance() {
     return InstanceHolder.instance;
   }
 
-  public static synchronized void init(HiveConf conf){
+  public static synchronized void init(HiveConf conf) {
     getInstance().configure(conf);
   }
 
   public void submit(TezCounters counters) {
-    updateMetrics(NUM_OBSOLETE_DELTAS,
-        obsoleteDeltaCache, obsoleteDeltaTopN, obsoleteDeltasThreshold, counters);
-    updateMetrics(NUM_DELTAS,
-        deltaCache, deltaTopN, deltasThreshold, counters);
-    updateMetrics(NUM_SMALL_DELTAS,
-        smallDeltaCache, smallDeltaTopN, deltasThreshold, counters);
+    if(acidMetricsExtEnabled) {
+      updateMetrics(NUM_OBSOLETE_DELTAS,
+          obsoleteDeltaCache, obsoleteDeltaTopN, obsoleteDeltasThreshold, counters);
+      updateMetrics(NUM_DELTAS,
+          deltaCache, deltaTopN, deltasThreshold, counters);
+      updateMetrics(NUM_SMALL_DELTAS,
+          smallDeltaCache, smallDeltaTopN, deltasThreshold, counters);
+    }
   }
 
-  public static void mergeDeltaFilesStats(AcidDirectory dir, long checkThresholdInSec,
-        float deltaPctThreshold, EnumMap<DeltaFilesMetricType, Map<String, Integer>> deltaFilesStats) throws IOException {
-    long baseSize = getBaseSize(dir);
-    int numObsoleteDeltas = getNumObsoleteDeltas(dir, checkThresholdInSec);
+  public static void mergeDeltaFilesStats(AcidDirectory dir, long checkThresholdInSec, float deltaPctThreshold,
+      EnumMap<DeltaFilesMetricType, Map<String, Integer>> deltaFilesStats, Configuration conf) throws IOException {
+    if (MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)) {

Review comment:
       Instead of adding the check here, it makes a bit more sense to add it to
the existing checks in
org.apache.hadoop.hive.ql.io.orc.OrcInputFormat#generateSplitsInfo:
   ```
   if (metricsEnabled && directory instanceof AcidDirectory) {
     DeltaFilesMetricReporter.mergeDeltaFilesStats((AcidDirectory) directory, checkThresholdInSec,
         deltaPctThreshold, deltaFilesStats);
   }
   ```
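
A minimal sketch of what gating at the call site could look like, assuming the new flag is read once into a boolean next to `metricsEnabled`. `Directory`, `AcidDir`, `PlainDir`, `mergeStats` and `acidMetricsExtEnabled` below are hypothetical stand-ins, not the actual Hive classes or the code in this PR:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: every type here is a hypothetical stand-in for the Hive classes.
final class CallSiteGuardSketch {
  interface Directory {}
  static final class AcidDir implements Directory {}   // stand-in for AcidDirectory
  static final class PlainDir implements Directory {}  // any non-ACID directory

  // Records which directories were merged, so the effect of the guard is visible.
  static final List<Directory> merged = new ArrayList<>();

  // Stand-in for mergeDeltaFilesStats: contains no flag check of its own.
  static void mergeStats(AcidDir dir) {
    merged.add(dir);
  }

  // Stand-in for the caller: all three conditions are checked in one place.
  static void generateSplitsInfo(Directory directory, boolean metricsEnabled,
      boolean acidMetricsExtEnabled) {
    if (metricsEnabled && acidMetricsExtEnabled && directory instanceof AcidDir) {
      mergeStats((AcidDir) directory);
    }
  }

  public static void main(String[] args) {
    generateSplitsInfo(new AcidDir(), true, false);  // skipped: ext flag off
    generateSplitsInfo(new PlainDir(), true, true);  // skipped: not an ACID directory
    generateSplitsInfo(new AcidDir(), true, true);   // merged
    System.out.println("merged directories: " + merged.size());
  }
}
```

With the guard in the caller, the callee keeps its original signature and does not need a `Configuration` parameter.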
   

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##########
@@ -111,7 +111,7 @@ public void run() {
         // so wrap it in a big catch Throwable statement.
         try {
           handle = txnHandler.getMutexAPI().acquireLock(TxnStore.MUTEX_KEY.Cleaner.name());
-          if (metricsEnabled) {
+          if (metricsEnabled && MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)) {

Review comment:
       I think this is the same logic as
   `metricsEnabled = MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METRICS_ENABLED)
   && MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)`,
   right?
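
To illustrate, here is a rough sketch of computing the combined flag once so the `if` stays short. The `Map`-based conf and the `getBool` helper are hypothetical stand-ins for `Configuration` and `MetastoreConf.getBoolVar`. The `true` default for the ext flag comes from the diff above, while the `false` default for `hive.metastore.metrics.enabled` is an assumption:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a plain Map stands in for the real Configuration object.
final class CombinedFlagSketch {
  static final String METRICS_ENABLED = "hive.metastore.metrics.enabled";
  static final String ACIDMETRICS_EXT_ON = "hive.metastore.acidmetrics.ext.on";

  // Hypothetical helper: reads a boolean, falling back to a default when unset.
  static boolean getBool(Map<String, String> conf, String key, boolean dflt) {
    String value = conf.get(key);
    return value == null ? dflt : Boolean.parseBoolean(value);
  }

  // Both flags folded into one boolean, evaluated once instead of on every loop pass.
  static boolean metricsEnabled(Map<String, String> conf) {
    return getBool(conf, METRICS_ENABLED, false)      // assumed default
        && getBool(conf, ACIDMETRICS_EXT_ON, true);   // default taken from the diff
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(METRICS_ENABLED, "true");
    boolean metricsEnabled = metricsEnabled(conf);
    if (metricsEnabled) {
      System.out.println("collecting extra ACID metrics");
    }
  }
}
```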

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java
##########
@@ -115,41 +117,45 @@ public static DeltaFilesMetricReporter getInstance() {
     return InstanceHolder.instance;
   }
 
-  public static synchronized void init(HiveConf conf){
+  public static synchronized void init(HiveConf conf) {
     getInstance().configure(conf);
   }
 
   public void submit(TezCounters counters) {
-    updateMetrics(NUM_OBSOLETE_DELTAS,
-        obsoleteDeltaCache, obsoleteDeltaTopN, obsoleteDeltasThreshold, counters);
-    updateMetrics(NUM_DELTAS,
-        deltaCache, deltaTopN, deltasThreshold, counters);
-    updateMetrics(NUM_SMALL_DELTAS,
-        smallDeltaCache, smallDeltaTopN, deltasThreshold, counters);
+    if(acidMetricsExtEnabled) {

Review comment:
       It makes more sense to add this check to 
org.apache.hadoop.hive.ql.exec.tez.TezTask#execute instead, so all the checks 
are in one place:
   ```
             if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_SERVER2_METRICS_ENABLED)) {
               DeltaFilesMetricReporter.getInstance().submit(dagCounters);
             }
   ```

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java
##########
@@ -230,23 +240,26 @@ private static long getDirSize(AcidUtils.ParsedDirectory dir, FileSystem fs) thr
       .sum();
   }
 
-  private void configure(HiveConf conf){
-    deltasThreshold = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_TXN_ACID_METRICS_DELTA_NUM_THRESHOLD);
-    obsoleteDeltasThreshold = HiveConf.getIntVar(conf, HiveConf.ConfVars.HIVE_TXN_ACID_METRICS_OBSOLETE_DELTA_NUM_THRESHOLD);
-
-    initMetricsCache(conf);
-    long reportingInterval = HiveConf.getTimeVar(conf,
-        HiveConf.ConfVars.HIVE_TXN_ACID_METRICS_REPORTING_INTERVAL, TimeUnit.SECONDS);
-
-    ThreadFactory threadFactory =
-      new ThreadFactoryBuilder()
-        .setDaemon(true)
-        .setNameFormat("DeltaFilesMetricReporter %d")
-        .build();
-    executorService = Executors.newSingleThreadScheduledExecutor(threadFactory);
-    executorService.scheduleAtFixedRate(
-        new ReportingTask(), 0, reportingInterval, TimeUnit.SECONDS);
-    LOG.info("Started DeltaFilesMetricReporter thread");
+  private void configure(HiveConf conf) {
+    acidMetricsExtEnabled = MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON);
+    if (acidMetricsExtEnabled) {

Review comment:
       It would be nicer to include this check in HiveServer2#init here:
   ```
         if (hiveConf.getBoolVar(ConfVars.HIVE_SERVER2_METRICS_ENABLED)) {
           MetricsFactory.init(hiveConf);
           DeltaFilesMetricReporter.init(hiveConf);
         }
   ```

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java
##########
@@ -190,13 +197,16 @@ public static void createCountersForAcidMetrics(TezCounters tezCounters, JobConf
   }
 
   public static void addAcidMetricsToConfObj(EnumMap<DeltaFilesMetricType, Map<String, Integer>> deltaFilesStats, Configuration conf) {
-    deltaFilesStats.forEach((type, value) ->
-        conf.set(type.name(), Joiner.on(",").withKeyValueSeparator("->").join(value))
-    );
+    if (MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)) {

Review comment:
       I guess this doesn't have a check for
`HiveConf.getBoolVar(jobConf, HiveConf.ConfVars.HIVE_SERVER2_METRICS_ENABLED)`
because the deltaFilesStats map is empty if metrics are off, so the forEach
does nothing. I think you can remove this check.
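
A small sketch of why the extra guard looks redundant when the map is empty. A plain `Map` stands in for `Configuration`, `MetricType` stands in for `DeltaFilesMetricType`, and stream joining stands in for the Guava `Joiner` in the original code. All names are illustrative:

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Sketch only: shows that forEach over an empty EnumMap writes nothing to the conf.
final class EmptyStatsSketch {
  enum MetricType { NUM_OBSOLETE_DELTAS, NUM_DELTAS, NUM_SMALL_DELTAS }

  // Same shape as addAcidMetricsToConfObj, without any flag check.
  static void addToConf(EnumMap<MetricType, Map<String, Integer>> stats,
      Map<String, String> conf) {
    stats.forEach((type, value) ->
        conf.put(type.name(), value.entrySet().stream()
            .map(e -> e.getKey() + "->" + e.getValue())
            .collect(Collectors.joining(","))));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();

    // Metrics off: nothing was merged upstream, so the map is empty and conf is untouched.
    addToConf(new EnumMap<>(MetricType.class), conf);
    System.out.println("entries after empty stats: " + conf.size());

    // Metrics on: one entry per metric type that has data.
    EnumMap<MetricType, Map<String, Integer>> stats = new EnumMap<>(MetricType.class);
    Map<String, Integer> deltas = new TreeMap<>();
    deltas.put("db.tbl", 3);
    stats.put(MetricType.NUM_DELTAS, deltas);
    addToConf(stats, conf);
    System.out.println(conf);
  }
}
```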

##########
File path: 
ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/metrics/DeltaFilesMetricReporter.java
##########
@@ -206,7 +216,7 @@ public static void backPropagateAcidMetrics(JobConf jobConf, Configuration conf)
   }
 
   public static void close() {
-    if (getInstance() != null) {
+    if (getInstance() != null && getInstance().acidMetricsExtEnabled) {

Review comment:
       I don't think this is necessary. If the instance exists, it should be
shut down, regardless of the flag.

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/txn/compactor/Cleaner.java
##########
@@ -111,7 +111,7 @@ public void run() {
         // so wrap it in a big catch Throwable statement.
         try {
           handle = txnHandler.getMutexAPI().acquireLock(TxnStore.MUTEX_KEY.Cleaner.name());
-          if (metricsEnabled) {
+          if (metricsEnabled && MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)) {

Review comment:
       This is the same logic as
   `metricsEnabled = MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METASTORE_ACIDMETRICS_EXT_ON)
   && MetastoreConf.getBoolVar(conf, MetastoreConf.ConfVars.METRICS_ENABLED)`
(which would be easier to read), right?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 608549)
    Time Spent: 20m  (was: 10m)

> Put metrics collection behind a feature flag
> --------------------------------------------
>
>                 Key: HIVE-25081
>                 URL: https://issues.apache.org/jira/browse/HIVE-25081
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Antal Sinkovits
>            Assignee: Antal Sinkovits
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Most metrics we're creating are collected in AcidMetricsService, which is 
> behind a feature flag. However, there are some metrics that are collected 
> outside of the service. These should be behind a feature flag in addition to 
> hive.metastore.metrics.enabled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
