[ https://issues.apache.org/jira/browse/HIVE-27158?focusedWorklogId=856663&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-856663 ]
ASF GitHub Bot logged work on HIVE-27158:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 13/Apr/23 08:17
            Start Date: 13/Apr/23 08:17
    Worklog Time Spent: 10m
      Work Description: InvisibleProgrammer commented on code in PR #4131:
URL: https://github.com/apache/hive/pull/4131#discussion_r1165171624


##########
ql/src/java/org/apache/hadoop/hive/ql/metadata/HiveStorageHandler.java:
##########
@@ -245,6 +248,44 @@ default boolean canProvideBasicStatistics() {
     return false;
   }
 
+  /**
+   * Return some col statistics (Lower bounds, Upper bounds, Null value counts, NaN, total counts) calculated by
+   * the underlying storage handler implementation.
+   * @param table
+   * @return A List of Column Statistics Objects, can be null
+   */
+  default List<ColumnStatisticsObj>getColStatistics(org.apache.hadoop.hive.ql.metadata.Table table) {
+    return null;
+  }
+
+  /**
+   * Set column stats for non-native tables
+   * @param table
+   * @param colStats
+   * @return boolean
+   */
+  default boolean setColStatistics(org.apache.hadoop.hive.ql.metadata.Table table,
+      List<ColumnStatistics> colStats) {
+    return false;
+  }
+
+  /**
+   * Check if the storage handler can provide col statistics.
+   * @param tbl
+   * @return true if the storage handler can supply the col statistics
+   */
+  default boolean canProvideColStatistics(org.apache.hadoop.hive.ql.metadata.Table tbl) {
+    return false;
+  }
+
+  /**
+   * Check if the storage handler can set col statistics.
+   * @return true if the storage handler can set the col statistics
+   */
+  default boolean canSetColStatistics(org.apache.hadoop.hive.ql.metadata.Table tbl) {

Review Comment:
   I don't know the right answer, I'm just thinking aloud: if we have a pair of methods like `canSetColStatistics` and `setColStatistics`, can we design them so that `setColStatistics` cannot be called when the statistics cannot be set?

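
   One possible shape for this, as a rough sketch only and not a proposal for the actual Hive API: replace the `canSet...`/`set...` pair with a single accessor that returns an optional "writer" capability, so a caller can only reach `setColStatistics` after the handler has handed one out. Every name below (`StorageHandlerSketch`, `ColStatsWriter`, `colStatsWriter`, `SketchTable`, `SketchColumnStatistics`, and the two handler classes) is hypothetical and stands in for the real Hive types.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for org.apache.hadoop.hive.ql.metadata.Table.
final class SketchTable {
    final String name;
    SketchTable(String name) { this.name = name; }
}

// Hypothetical stand-in for Hive's ColumnStatistics.
final class SketchColumnStatistics {
    final String column;
    SketchColumnStatistics(String column) { this.column = column; }
}

// Hypothetical capability object: holding one means stats can be set.
interface ColStatsWriter {
    boolean setColStatistics(List<SketchColumnStatistics> colStats);
}

// Hypothetical variant of the handler interface: one accessor replaces
// the canSetColStatistics/setColStatistics pair.
interface StorageHandlerSketch {
    default Optional<ColStatsWriter> colStatsWriter(SketchTable table) {
        return Optional.empty(); // default: this handler cannot set stats
    }
}

// A handler that keeps the default and therefore exposes no writer.
final class PlainHandler implements StorageHandlerSketch { }

// A handler that supports setting stats; it just records them in memory.
final class RecordingHandler implements StorageHandlerSketch {
    final List<SketchColumnStatistics> stored = new ArrayList<>();

    @Override
    public Optional<ColStatsWriter> colStatsWriter(SketchTable table) {
        // addAll returns true when the list changed, which serves as the result.
        return Optional.of(colStats -> stored.addAll(colStats));
    }
}

final class ColStatsSketchDemo {
    public static void main(String[] args) {
        StorageHandlerSketch handler = new RecordingHandler();
        // The set call is reachable only through the Optional.
        handler.colStatsWriter(new SketchTable("t"))
               .ifPresent(w -> w.setColStatistics(
                   Collections.singletonList(new SketchColumnStatistics("c1"))));
    }
}
```

   The trade-off is an extra type and an allocation per call. In exchange, the "can it be set" check and the set operation cannot drift apart, because there is no set method on the handler itself to call by mistake.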
Issue Time Tracking
-------------------

    Worklog Id:     (was: 856663)
    Time Spent: 10h  (was: 9h 50m)

> Store hive columns stats in puffin files for iceberg tables
> -----------------------------------------------------------
>
>                 Key: HIVE-27158
>                 URL: https://issues.apache.org/jira/browse/HIVE-27158
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Simhadri Govindappa
>            Assignee: Simhadri Govindappa
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10h
>  Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)