[
https://issues.apache.org/jira/browse/HBASE-29149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18001380#comment-18001380
]
Vinayak Hegde commented on HBASE-29149:
---------------------------------------
Yeah, I believe this issue might also be causing the failures in the
Incremental Backup unit tests.

Hi [~hgromer], just checking: if you're not actively working on this, would
it be okay if I take it up?
> WAL files can be archived during incremental backup process
> -----------------------------------------------------------
>
> Key: HBASE-29149
> URL: https://issues.apache.org/jira/browse/HBASE-29149
> Project: HBase
> Issue Type: Bug
> Reporter: Hernan Gelaf-Romer
> Assignee: Hernan Gelaf-Romer
> Priority: Major
>
> At my job, we've run into FileNotFoundException (FNFE) failures when WAL
> files are archived while they are being loaded for conversion into HFiles.
> The failure logs show that the load was attempted just after the archive had
> occurred server-side.
>
> {quote}2025-02-24 17:10:34.333 [pool-124-thread-1] ERROR
> o.a.h.h.b.impl.TableBackupClient - Unexpected exception in
> incremental-backup: incremental copy backup_1740417014671File
> hdfs://nestor-hb2-a-qa:8020/hbase/WALs/na1-purple-dizzy-antelope.iad03.hubinternal.net,60020,1739996267893/na1-purple-dizzy-antelope.iad03.hubinternal.net%2C60020%2C1739996267893.1740412909549
> does not exist.
> java.io.FileNotFoundException: File
> hdfs://nestor-hb2-a-qa:8020/hbase/WALs/na1-purple-dizzy-antelope.iad03.hubinternal.net,60020,1739996267893/na1-purple-dizzy-antelope.iad03.hubinternal.net%2C60020%2C1739996267893.1740412909549
> does not exist.
> {quote}
>
> {quote}2025-02-24 17:10:17.787 Archiving
> hdfs://nestor-hb2-a-qa:8020/hbase/WALs/na1-purple-dizzy-antelope.iad03.hubinternal.net,60020,1739996267893/na1-purple-dizzy-antelope.iad03.hubinternal.net%2C60020%2C1739996267893.1740412909549
> to
> hdfs://nestor-hb2-a-qa:8020/hbase/oldWALs/na1-purple-dizzy-antelope.iad03.hubinternal.net%2C60020%2C1739996267893.1740412909549
> {quote}
>
> We already handle a similar situation when loading bulkloads, where we added
> a retry mechanism that checks the archive directory. We should probably do
> something similar here.
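
The fallback suggested in the description could look roughly like the sketch
below. It is only an illustration of the idea, not the existing bulkload code
or a proposed patch: WalFallback and resolve are hypothetical names, and
java.nio.file stands in for the Hadoop FileSystem API so the example runs on
its own. It assumes, as the log excerpt shows, that archiving moves the file
into the oldWALs directory under the same file name.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper, not part of HBase: locates a WAL that may have been archived. */
final class WalFallback {
  private WalFallback() {}

  /**
   * Returns walPath if it still exists. Otherwise looks for a file with the
   * same name directly under oldWalsDir (the archive directory) and returns
   * that path if present.
   */
  static Path resolve(Path walPath, Path oldWalsDir) throws IOException {
    if (Files.exists(walPath)) {
      return walPath;
    }
    // Assumption: the archived copy keeps the original file name.
    Path archived = oldWalsDir.resolve(walPath.getFileName());
    if (Files.exists(archived)) {
      return archived;
    }
    throw new FileNotFoundException(
        "WAL not found at " + walPath + " or in archive at " + archived);
  }
}
```

A caller would use the path returned by WalFallback.resolve(wal, oldWalsDir)
instead of the original WAL path. An existence check alone still leaves a
window between the check and the open, so a real fix would probably also catch
the FileNotFoundException at open time and retry against the archive location.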