[ 
https://issues.apache.org/jira/browse/IMPALA-13173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Noémi Pap-Takács updated IMPALA-13173:
--------------------------------------
    Priority: Minor  (was: Major)

> Redundant Catalog Update Check in Coordinator
> ---------------------------------------------
>
>                 Key: IMPALA-13173
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13173
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Noémi Pap-Takács
>            Assignee: Noémi Pap-Takács
>            Priority: Minor
>              Labels: impala-iceberg
>
> For DML operations, the Coordinator sends an update to the Catalog about the
> files changed in the table. Before sending the update, we check whether any
> files were added or deleted. If none were, we skip the catalog update. See
> the logic in _'DmlExecState::PrepareCatalogUpdate'_.
> However, for unpartitioned Iceberg tables, the check in
> _'DmlExecState::PrepareCatalogUpdate'_ always returns true, so the Catalog is
> updated even if no files were added. Currently, this does not cause incorrect
> behavior because the presence of created files is checked again later in
> client-request-state.cc.
> On the other hand, there are cases where writing no files is not a NO-OP,
> for example overwriting a table with empty content, or an OPTIMIZE TABLE that
> merges delete files. The Catalog needs to be informed about the changes in
> such cases.
> We should filter NO-OP DMLs correctly in the Coordinator, eliminating both
> false positive and false negative updates.
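
The filtering rule described above could be sketched roughly as follows. This is only an illustration, not Impala code: the DmlSummary struct, its fields, and NeedsCatalogUpdate are hypothetical names invented for this example, and the real decision lives in DmlExecState::PrepareCatalogUpdate and client-request-state.cc. It assumes the Coordinator knows how many files were added and deleted, and whether the statement is one of the "no files written, but still a change" cases.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical summary of a finished DML statement; not an Impala type.
struct DmlSummary {
  int64_t files_added = 0;
  int64_t files_deleted = 0;
  // Assumed flag: the statement replaces table content (e.g. an overwrite),
  // so an empty result still changes the table.
  bool is_overwrite = false;
  // Assumed flag: the statement rewrites delete files (e.g. OPTIMIZE TABLE),
  // so the table changes even if no new files are written.
  bool rewrites_delete_files = false;
};

// Returns true if the Catalog should be informed about this DML,
// false if the statement was a genuine NO-OP.
bool NeedsCatalogUpdate(const DmlSummary& s) {
  assert(s.files_added >= 0 && s.files_deleted >= 0);
  // Any file-level change always requires an update.
  if (s.files_added > 0 || s.files_deleted > 0) return true;
  // No files touched: only the special cases still count as a change.
  return s.is_overwrite || s.rewrites_delete_files;
}
```

With a rule of this shape, an INSERT that writes nothing would be skipped (no false positive), while an overwrite with empty content would still reach the Catalog (no false negative).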



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

