prashantwason commented on code in PR #18179:
URL: https://github.com/apache/hudi/pull/18179#discussion_r2914897732
##########
hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/client/common/HoodieSparkEngineContext.java:
##########
@@ -86,6 +89,12 @@ public class HoodieSparkEngineContext extends
HoodieEngineContext {
private final SQLContext sqlContext;
private final Map<HoodieDataCacheKey, List<Integer>> cachedRddIds = new
HashMap<>();
+ /**
+ * Map of all distributed registries created via getMetricRegistry().
+ * This map is passed to Spark executors to make the registries available
there.
+ */
+ private static final Map<String, Registry> DISTRIBUTED_REGISTRY_MAP = new
ConcurrentHashMap<>();
Review Comment:
The \`DISTRIBUTED_REGISTRY_MAP\` is static and lives for the lifetime of the
JVM. This is intentional — registries are long-lived singletons that correspond
to Spark accumulators (which are also tied to the SparkContext lifetime). Since
Hudi tables are typically long-lived in a Spark application, the number of
entries is bounded by the number of distinct (tableName, registryName) pairs,
which is small (currently 2 registries per table: \`HoodieWrapperFileSystem\`
and \`HoodieWrapperFileSystemMetaFolder\`).
If we need cleanup for cases where tables are dropped/removed, we could add
a \`removeMetricRegistry()\` method in a follow-up, but for now the memory
footprint is negligible.
##########
hudi-io/src/main/java/org/apache/hudi/common/metrics/Registry.java:
##########
@@ -67,10 +117,27 @@ static Registry getRegistry(String registryName, String
clazz) {
* @return {@link Map} of metrics name and value
*/
static Map<String, Long> getAllMetrics(boolean flush, boolean
prefixWithRegistryName) {
+ return getAllMetrics(flush, prefixWithRegistryName, Option.empty());
+ }
+
+ /**
+ * Get all registered metrics.
+ *
+ * If a Registry did not have a prefix in its name, the commonPrefix is
pre-pended to its name.
+ *
+ * @param flush clear all metrics after this operation.
+ * @param prefixWithRegistryName prefix each metric name with the registry
name.
+ * @param commonPrefix prefix to use if the registry name does not have a
prefix itself.
+ * @return {@link Map} of metrics name and value
+ */
+ static Map<String, Long> getAllMetrics(boolean flush, boolean
prefixWithRegistryName, Option<String> commonPrefix) {
Review Comment:
Sure, happy to sync up. Let me know when works for you.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]