Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/23810 )

Change subject: IMPALA-14647: Fix truncate for replicated txn tables always 
delete data
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/23810/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/23810/1//COMMIT_MSG@14
PS1, Line 14: In HiveServer, there is a configuration, 
"hive.acid.truncate.usebase",
That's a good point! I just realized that "hive.acid.truncate.usebase" can be 
set at the session level. I still think the current patch is a bug fix rather 
than a breaking change: in environments that use only Impala and not Hive, 
replication is not enabled, so the modified code path is never used.

> I am unsure about how to configure this - does Impala use Hive configuration 
> at other places?

Yes, here are some examples:
https://github.com/apache/impala/blob/6f3deabb9d0c0ca98956316bbcc31e14a3363804/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java#L2949-L2951
https://github.com/apache/impala/blob/6f3deabb9d0c0ca98956316bbcc31e14a3363804/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java#L96-L97
https://github.com/apache/impala/blob/6f3deabb9d0c0ca98956316bbcc31e14a3363804/fe/src/main/java/org/apache/impala/catalog/MetaStoreClientPool.java#L169-L170
https://github.com/apache/impala/blob/6f3deabb9d0c0ca98956316bbcc31e14a3363804/fe/src/main/java/org/apache/impala/catalog/iceberg/IcebergCatalogs.java#L153-L155
https://github.com/apache/impala/blob/3a5a6f612a332fc509cfdc73c4566356a00ac730/fe/src/main/java/org/apache/impala/catalog/paimon/PaimonUtil.java#L258-L260
https://github.com/apache/impala/blob/3a5a6f612a332fc509cfdc73c4566356a00ac730/fe/src/main/java/org/apache/impala/common/TransactionKeepalive.java#L216-L217

However, none of them is configurable at the session level, so I think we need 
a query option to support this.
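
To make the precedence concrete, here is a minimal sketch of how a session-level 
query option could override the site-level Hive setting. The class name, the 
method, and the idea of passing the query option as an Optional are all 
hypothetical and are not taken from the patch; the site configuration is 
modeled as a plain Map rather than a real HiveConf, and the fallback default is 
left to the caller instead of assuming Hive's default.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Impala: picks the effective value of
// "hive.acid.truncate.usebase" for one query.
final class TruncateUseBaseResolver {
  static final String HIVE_KEY = "hive.acid.truncate.usebase";

  // Precedence: query option (session level), then site config, then the
  // caller-supplied default.
  static boolean resolve(Optional<Boolean> queryOption,
      Map<String, String> siteConf, boolean defaultValue) {
    if (queryOption.isPresent()) return queryOption.get();
    String siteValue = siteConf.get(HIVE_KEY);
    if (siteValue != null) return Boolean.parseBoolean(siteValue.trim());
    return defaultValue;
  }
}
```

With this shape, a user who sets the (hypothetical) query option gets 
session-level control, while unset sessions keep following the site config.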

> I would also consider always calling the HMS API to ensure that things work 
> the same way in replicated/non-replicated cases.

Yeah, that would make the code cleaner. It will introduce two changes, but I 
don't think they are breaking changes:
* Impala creates a new base_* dir with an empty file (named "empty") for each 
partition. BTW, the file name should be "_empty" so that it can be skipped. 
Hive creates a non-empty "_metadata_acid" file, but it will be treated as a 
hidden file in Impala.
* Impala doesn't update the stats. Hive updates the stats for truncate.
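
The file-name point can be illustrated with a simplified check. This is only a 
sketch under the assumption that a leading "_" or "." marks a file as hidden; 
the class and method names are hypothetical, and Impala's real logic may apply 
additional rules.

```java
// Hypothetical, simplified hidden-file check; not Impala's actual code.
final class HiddenFileCheck {
  // Assumption: names starting with '_' or '.' are skipped as hidden files.
  static boolean isHiddenName(String fileName) {
    return fileName.startsWith("_") || fileName.startsWith(".");
  }
}
```

Under this assumption, "empty" is seen as a data file, while "_empty" and 
"_metadata_acid" are skipped.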

I'll change the patch when we reach a consensus.



--
To view, visit http://gerrit.cloudera.org:8080/23810
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia31991baeb2ef8717c387b841b65cff562dbcae0
Gerrit-Change-Number: 23810
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Sai Hemanth Gantasala <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Tue, 30 Dec 2025 02:56:08 +0000
Gerrit-HasComments: Yes
