[
https://issues.apache.org/jira/browse/HIVE-28144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HIVE-28144:
----------------------------------
Labels: pull-request-available (was: )
> Remove overly verbose debug messages from MetastoreDirectSqlUtils
> -----------------------------------------------------------------
>
> Key: HIVE-28144
> URL: https://issues.apache.org/jira/browse/HIVE-28144
> Project: Hive
> Issue Type: Task
> Components: Standalone Metastore
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
>
> When BITVECTOR or KLL stats are disabled or not present in the metastore, [the
> following
> message|https://github.com/apache/hive/blob/8eee4aa9d1bd6f4193471f5d014324bbaf552041/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L594]
> may appear far too often in the HMS logs.
> {noformat}
> 2024-03-22T01:50:57,849 DEBUG [CachedStore-CacheUpdateService: Thread-240]
> metastore.MetastoreDirectSqlUtils: Expected blob type but got java.lang.String
> {noformat}
> In fact, in some cases the message appears more than once for every single
> partition that is present in the table(s) being queried. When the number of
> partitions is large, it can easily clog the logs with redundant and
> useless information.
> To put things in perspective, while running cbo_query10.q on the
> statistics of the TPC-DS 30TB dataset, the message occupies more than 50%
> (26MB) of the total log file (46MB).
> {noformat}
> $ mvn test -Dtest=TestTezTPCDS30TBPerfCliDriver -Dqfile=cbo_query10.q
> $ grep -a "Expected blob type but got java.lang.String" target/tmp/log/hive.log | wc -c
> 26129538
> $ wc -c target/tmp/log/hive.log
> 46959003 target/tmp/log/hive.log
> {noformat}
> The presence of the message does not tell us much on its own. In conjunction
> with the code we can infer that we are not fetching BITVECTOR/KLL stats from
> the metastore, but this could be reported in a different place without having
> to print the same message 170K times.
>
> Removing this message saves disk space, avoids frequent log rotation, and
> improves the overall readability of the log file.
> There is another redundant
> message which appears when transforming a [database value to
> Boolean|https://github.com/apache/hive/blob/8eee4aa9d1bd6f4193471f5d014324bbaf552041/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetastoreDirectSqlUtils.java#L557].
> The message is redundant since it is followed directly by an exception so
> there is no reason to have both. This message may not appear as often as the
> previous one but given that it doesn't add much value it can also be removed.
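
For illustration only: the description notes that the absence of BITVECTOR/KLL
stats could be reported in a different place instead of once per row. One
possible shape for that, sketched below, is a report-once guard. This is not
the code in MetastoreDirectSqlUtils and not the change proposed in this issue
(which simply removes the message); the class, method, and field names are
hypothetical, and System.out stands in for the real logger.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: report an unexpected (non-blob) value once per process
// instead of once per row. None of these names come from the Hive code base.
public class BlobLogOnceSketch {
    // Flips to true the first time a non-blob value is encountered.
    private static final AtomicBoolean REPORTED = new AtomicBoolean(false);
    // Counts how many notices were actually emitted, so the effect is observable.
    public static final AtomicInteger MESSAGES = new AtomicInteger();

    // Returns the value as bytes when it already is a byte[], otherwise null.
    public static byte[] extractBytes(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        // Only the first caller wins the compare-and-set and emits the notice.
        if (REPORTED.compareAndSet(false, true)) {
            MESSAGES.incrementAndGet();
            System.out.println("Stats column is not a blob (got "
                + value.getClass().getName() + "); further occurrences are not reported");
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulate one call per row, roughly the 170K repetitions cited above.
        for (int i = 0; i < 170_000; i++) {
            extractBytes("not-a-blob");
        }
        System.out.println("messages logged: " + MESSAGES.get());
    }
}
```

With a guard like this the notice is emitted once regardless of the number of
partitions. Whether a single notice is worth keeping at all is a separate
question; the issue argues for dropping the message entirely.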
--
This message was sent by Atlassian Jira
(v8.20.10#820010)