[ https://issues.apache.org/jira/browse/HIVE-25596?focusedWorklogId=674358&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-674358 ]
ASF GitHub Bot logged work on HIVE-25596:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 03/Nov/21 03:18
            Start Date: 03/Nov/21 03:18
    Worklog Time Spent: 10m
      Work Description:
pkumarsinha commented on a change in pull request #2724:
URL: https://github.com/apache/hive/pull/2724#discussion_r741594555

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/parse/repl/metric/MetricSink.java
##########
@@ -116,14 +117,17 @@ public void run() {
           int totalMetricsSize = metrics.size();
           List<ReplicationMetrics> replicationMetricsList = new ArrayList<>(totalMetricsSize);
           ObjectMapper mapper = new ObjectMapper();
+          MessageEncoder encoder = MessageFactory.getDefaultInstanceForReplMetrics(conf);
+          MessageSerializer serializer = encoder.getSerializer();
           for (int index = 0; index < totalMetricsSize; index++) {
             ReplicationMetric metric = metrics.removeFirst();
             ReplicationMetrics persistentMetric = new ReplicationMetrics();
             persistentMetric.setDumpExecutionId(metric.getDumpExecutionId());
             persistentMetric.setScheduledExecutionId(metric.getScheduledExecutionId());
             persistentMetric.setPolicy(metric.getPolicy());
-            persistentMetric.setProgress(mapper.writeValueAsString(metric.getProgress()));
-            persistentMetric.setMetadata(mapper.writeValueAsString(metric.getMetadata()));
+            persistentMetric.setProgress(serializer.serialize(mapper.writeValueAsString(metric.getProgress())));
+            persistentMetric.setMetadata(serializer.serialize(mapper.writeValueAsString(metric.getMetadata())));

Review comment:
How does this justify a need to compress the metadata field in that case?
I think we should focus on the size in the worst case and then see how it changes after compression. That way we can decide:
a) whether we really need compression for the metadata column;
b) if so, how large the column should be.
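
The measurement suggested in the review comment could be sketched roughly as below. This is only an illustration under stated assumptions: it uses plain gzip followed by Base64 as a stand-in for whatever the configured MessageSerializer does, and the class name MetricSizeCheck, the helper sampleJson, and the JSON field names are hypothetical, not taken from Hive. A real check would feed in the largest metadata and progress payloads seen in practice, since this synthetic payload is highly repetitive and likely compresses better than production data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class MetricSizeCheck {

  // Assumption: gzip + Base64 stands in for the real serializer.
  // Base64 keeps the result storable in a text column but adds about 33% overhead.
  static String compress(String json) throws IOException {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
      gzip.write(json.getBytes(StandardCharsets.UTF_8));
    }
    return Base64.getEncoder().encodeToString(buffer.toByteArray());
  }

  // Inverse of compress(): Base64-decode, then gunzip back to the JSON text.
  static String decompress(String encoded) throws IOException {
    byte[] raw = Base64.getDecoder().decode(encoded);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(raw))) {
      byte[] chunk = new byte[4096];
      int read;
      while ((read = gzip.read(chunk)) != -1) {
        out.write(chunk, 0, read);
      }
    }
    return new String(out.toByteArray(), StandardCharsets.UTF_8);
  }

  // Hypothetical payload: field names are made up for the sketch.
  static String sampleJson(int entries) {
    StringBuilder sb = new StringBuilder("{\"stages\":[");
    for (int i = 0; i < entries; i++) {
      if (i > 0) {
        sb.append(',');
      }
      sb.append("{\"name\":\"stage_").append(i)
        .append("\",\"status\":\"SUCCESS\",\"rowsCopied\":").append(i * 1000L)
        .append('}');
    }
    return sb.append("]}").toString();
  }

  public static void main(String[] args) throws IOException {
    String json = sampleJson(200);
    String packed = compress(json);
    // Compare character counts, since that is what a VARCHAR column limit constrains.
    System.out.println("raw chars:        " + json.length());
    System.out.println("compressed chars: " + packed.length());
  }
}
```

Comparing the two lengths for the worst-case payload of each column gives the numbers needed for both questions: whether compression is worth it for that column, and what column size the compressed form requires.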

Issue Time Tracking
-------------------

    Worklog Id:     (was: 674358)
    Time Spent: 3h 20m  (was: 3h 10m)

> Compress Hive Replication Metrics while storing
> -----------------------------------------------
>
>                 Key: HIVE-25596
>                 URL: https://issues.apache.org/jira/browse/HIVE-25596
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Haymant Mangla
>            Assignee: Haymant Mangla
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Compress the json fields of sys.replication_metrics table to optimise RDBMS
> space usage.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)