[ https://issues.apache.org/jira/browse/HIVE-27356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17723889#comment-17723889 ]
Simhadri Govindappa edited comment on HIVE-27356 at 5/18/23 10:56 AM:
----------------------------------------------------------------------

Sure.
{quote}
Currently it writes a non-standard blob (Standard blob types are listed [here|https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java]). I think it would be better to write standard blobs for interoperability. But if Hive wants to write non-standard blobs anyway, it should still come up with a descriptive name for them, e.g. 'hive-column-statistics-v1'.
{quote}
In the initial design, we went with a column stats object. We can easily change this to a different blob type.


was (Author: simhadri-g):
Sure

> Hive should write name of blob type instead of table name in Puffin
> -------------------------------------------------------------------
>
>                 Key: HIVE-27356
>                 URL: https://issues.apache.org/jira/browse/HIVE-27356
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>
> Currently Hive writes the name of the table plus snapshot id as blob type:
> [https://github.com/apache/hive/blob/aa1e067033ef0b5468f725cfd3776810800af96d/iceberg/iceberg-handler/src/main/java/org/apache/iceberg/mr/hive/HiveIcebergStorageHandler.java#L422]
> Instead, it should write the name of the blob it writes. Table name and
> snapshot id are redundant information anyway, as they can be inferred from
> the location and filename of the puffin file.
> Currently it writes a non-standard blob (Standard blob types are listed
> [here|https://github.com/apache/iceberg/blob/master/core/src/main/java/org/apache/iceberg/puffin/StandardBlobTypes.java]).
> I think it would be better to write standard blobs for interoperability. But
> if Hive wants to write non-standard blobs anyway, it should still come up
> with a descriptive name for them, e.g. 'hive-column-statistics-v1'.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)