nsivabalan commented on code in PR #18179:
URL: https://github.com/apache/hudi/pull/18179#discussion_r2908207279
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java:
##########
@@ -86,6 +89,12 @@ public class HoodieSparkEngineContext extends
HoodieEngineContext {
private final SQLContext sqlContext;
private final Map<HoodieDataCacheKey, List<Integer>> cachedRddIds = new
HashMap<>();
+ /**
+ * Map of all distributed registries created via getMetricRegistry().
+ * This map is passed to Spark executors to make the registries available
there.
+ */
+ private static final Map<String, Registry> DISTRIBUTED_REGISTRY_MAP = new
ConcurrentHashMap<>();
Review Comment:
when do we clean up entries from this map ?
##########
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java:
##########
@@ -67,10 +117,27 @@ static Registry getRegistry(String registryName, String
clazz) {
* @return {@link Map} of metrics name and value
*/
static Map<String, Long> getAllMetrics(boolean flush, boolean
prefixWithRegistryName) {
+ return getAllMetrics(flush, prefixWithRegistryName, Option.empty());
+ }
+
+ /**
+ * Get all registered metrics.
+ *
+ * If a Registry did not have a prefix in its name, the commonPrefix is
pre-pended to its name.
+ *
+ * @param flush clear all metrics after this operation.
+ * @param prefixWithRegistryName prefix each metric name with the registry
name.
+ * @param commonPrefix prefix to use if the registry name does not have a
prefix itself.
+ * @return {@link Map} of metrics name and value
+ */
+ static Map<String, Long> getAllMetrics(boolean flush, boolean
prefixWithRegistryName, Option<String> commonPrefix) {
Review Comment:
probably will briefly sync up with you on the entire patch.
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/util/DistributedRegistryUtil.java:
##########
@@ -0,0 +1,56 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hudi.util;
+
+import org.apache.hudi.common.engine.HoodieEngineContext;
+import org.apache.hudi.common.metrics.Registry;
+import org.apache.hudi.config.HoodieWriteConfig;
+import org.apache.hudi.hadoop.fs.HoodieWrapperFileSystem;
+
+/**
+ * Utility class for creating and initializing distributed metric registries.
+ */
+public class DistributedRegistryUtil {
+
+ /**
+ * Creates and sets metrics registries for HoodieWrapperFileSystem.
+ * When executor metrics are enabled, this creates distributed registries
that can collect
+ * metrics from Spark executors. Otherwise, it creates local registries.
+ *
+ * @param context The engine context
+ * @param config The write configuration
+ */
+ public static void createWrapperFileSystemRegistries(HoodieEngineContext
context, HoodieWriteConfig config) {
+ if (config.isMetricsOn()) {
+ Registry registry;
+ Registry registryMeta;
+ if (config.isExecutorMetricsEnabled()) {
+ // Create and set distributed registry for HoodieWrapperFileSystem
+ registry = context.getMetricRegistry(config.getTableName(),
HoodieWrapperFileSystem.REGISTRY_NAME);
Review Comment:
Can you help me understand, how we are initializing DistributedRegistry when
executor metrics are enabled?
##########
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java:
##########
@@ -18,45 +18,95 @@
package org.apache.hudi.common.metrics;
+import org.apache.hudi.common.util.Option;
import org.apache.hudi.common.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
import java.io.Serializable;
+import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
/**
- * Interface which defines a lightweight Metrics Registry to track Hudi events.
+ * Interface which defines a lightweight Metrics Registry to track Hudi
metrics.
+ *
+ * Registries can be used to track related metrics under a common name - e.g.
metrics for Metadata Table.
*/
public interface Registry extends Serializable {
+ Logger LOG = LoggerFactory.getLogger(Registry.class);
+
+ /**
+ * Separator used for compound registry keys (tableName + registryName).
+ */
+ String KEY_SEPARATOR = "::";
+ /**
+ * Map of all registries that have been created.
+ * The key is a compound of table name and registry name in format
"tableName::registryName".
+ * For non-table specific registries (i.e. common for all tables) the
tableName would be empty.
+ */
ConcurrentHashMap<String, Registry> REGISTRY_MAP = new ConcurrentHashMap<>();
/**
- * Get (or create) the registry for a provided name.
- *
- * This function creates a {@code LocalRegistry}.
+ * Creates a compound key from tableName and registryName.
+ */
+ static String makeKey(String tableName, String registryName) {
+ return tableName + KEY_SEPARATOR + registryName;
Review Comment:
can we try `indexOf` based parsing?
##########
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java:
##########
@@ -18,45 +18,95 @@
package org.apache.hudi.common.metrics;
+import org.apache.hudi.common.util.Option;
import org.apache.hudi.common.util.ReflectionUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
import java.io.Serializable;
+import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
/**
- * Interface which defines a lightweight Metrics Registry to track Hudi events.
+ * Interface which defines a lightweight Metrics Registry to track Hudi
metrics.
+ *
+ * Registries can be used to track related metrics under a common name - e.g.
metrics for Metadata Table.
*/
public interface Registry extends Serializable {
+ Logger LOG = LoggerFactory.getLogger(Registry.class);
+
+ /**
+ * Separator used for compound registry keys (tableName + registryName).
+ */
+ String KEY_SEPARATOR = "::";
+ /**
+ * Map of all registries that have been created.
+ * The key is a compound of table name and registry name in format
"tableName::registryName".
+ * For non-table specific registries (i.e. common for all tables) the
tableName would be empty.
+ */
ConcurrentHashMap<String, Registry> REGISTRY_MAP = new ConcurrentHashMap<>();
/**
- * Get (or create) the registry for a provided name.
- *
- * This function creates a {@code LocalRegistry}.
+ * Creates a compound key from tableName and registryName.
+ */
+ static String makeKey(String tableName, String registryName) {
+ return tableName + KEY_SEPARATOR + registryName;
Review Comment:
I understand its not common to have "::" in table name. but this might break
if that happens right?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]