[ https://issues.apache.org/jira/browse/FLINK-29805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Caizhi Weng updated FLINK-29805:
--------------------------------
    Description: 
Table Store sink continuously fails with "Trying to add file which is already added" when snapshot committing is slow.

This is due to a bug in {{FileStoreCommitImpl#filterCommitted}}. When this method finds an identifier, it removes the identifier from a map. However, different snapshots may have the same identifier (for example, an APPEND commit and the following COMPACT commit will have the same identifier), so we need to use another set to check for identifiers.

When snapshot committing is fast, there is at most one identifier to check after the job restarts, so the bug does not surface. However, when snapshot committing is slow, there will be multiple identifiers to check and some identifiers will be mistakenly kept.

  was:
Table Store sink continuously fails with "Trying to add file which is already added" when snapshot committing is slow.

This is due to a bug in {{FileStoreCommitImpl#filterCommitted}}. When this method finds an identifier, it removes the identifier from a map. However, different snapshots may have the same identifier (for example, an APPEND commit and the following COMPACT commit will have the same identifier), so we need to use another set to check for identifiers.


> Table Store sink continuously fails with "Trying to add file which is already added" when snapshot committing is slow
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-29805
>                 URL: https://issues.apache.org/jira/browse/FLINK-29805
>             Project: Flink
>          Issue Type: Bug
>          Components: Table Store
>   Affects Versions: table-store-0.3.0, table-store-0.2.2
>            Reporter: Caizhi Weng
>            Assignee: Caizhi Weng
>            Priority: Major
>             Fix For: table-store-0.3.0, table-store-0.2.2
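
The failure mode described in this update can be illustrated with a small, self-contained model. This is only a sketch and not the Table Store source: the class name `FilterCommittedSketch`, the use of plain strings as identifiers, and the assumption that the method walks snapshots newest first and stops at the first identifier it does not recognize are all hypothetical and are not taken from the issue text. Under those assumptions, using the same map for both the membership check and the result causes the walk to stop at the second snapshot that shares an identifier, so older identifiers that were already committed are kept.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical, simplified model of the bug; not the actual Table Store code. */
class FilterCommittedSketch {

    /**
     * Buggy variant: the map serves as both the membership check and the result.
     * snapshotIdentifiers is assumed to be ordered newest first.
     */
    static List<String> filterBuggy(List<String> pending, List<String> snapshotIdentifiers) {
        Map<String, String> remaining = new LinkedHashMap<>();
        for (String id : pending) {
            remaining.put(id, id); // value stands in for the committable
        }
        for (String committed : snapshotIdentifiers) {
            if (remaining.containsKey(committed)) {
                remaining.remove(committed);
            } else {
                // Assumed early exit: an unknown identifier is taken to mean
                // "older than anything pending". A second snapshot with the
                // same identifier (APPEND after COMPACT) wrongly lands here.
                break;
            }
        }
        return new ArrayList<>(remaining.values());
    }

    /** Fixed variant: a separate, never-shrinking set answers the membership check. */
    static List<String> filterFixed(List<String> pending, List<String> snapshotIdentifiers) {
        Set<String> known = new HashSet<>(pending);
        Map<String, String> remaining = new LinkedHashMap<>();
        for (String id : pending) {
            remaining.put(id, id);
        }
        for (String committed : snapshotIdentifiers) {
            if (known.contains(committed)) {
                remaining.remove(committed); // no-op on the second occurrence
            } else {
                break;
            }
        }
        return new ArrayList<>(remaining.values());
    }

    public static void main(String[] args) {
        // Newest first: COMPACT(2), APPEND(2), COMPACT(1), APPEND(1)
        List<String> snapshots = List.of("2", "2", "1", "1");
        // Slow committing: several identifiers must be checked after a restart.
        List<String> pending = List.of("1", "2", "3");
        System.out.println(filterBuggy(pending, snapshots)); // [1, 3]
        System.out.println(filterFixed(pending, snapshots)); // [3]
    }
}
```

With a single already-committed identifier to check (for example `pending = ["2", "3"]`), both variants return `["3"]`, which is consistent with the report that the problem only appears when snapshot committing is slow.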