[
https://issues.apache.org/jira/browse/HBASE-29519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18014677#comment-18014677
]
Andor Molnar commented on HBASE-29519:
--------------------------------------
{quote}Also, at present, we process WAL entries as they arrive, write them to
the WAL file, and immediately upload the bulkloaded files. Later, when the WAL
file is full, we close it, which ensures it’s fully persisted. However, if we
upload the bulkloaded files but fail to close the WAL file, we may reprocess
those entries (since the offset won’t move until the WAL is closed) and upload
them again. This would overwrite the files, which might not be a major issue.
{quote}
This is another reason for the user to handle backing up bulkload files manually.
Are you able to check whether the bulkload file is already present at the backup
location from a previous successful upload? If so, do that check first and skip
the upload when the file is already there. Only upload if the file is missing or
a previous upload failed.
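
A rough sketch of the check suggested above, not the actual endpoint code. It
uses java.nio.file as a local stand-in for the backup file system; the class
name BulkloadCopySketch and the method copyIfMissing are hypothetical. Comparing
file lengths as a cheap guard against a partially written earlier upload is my
own assumption. Against S3 or HDFS, the equivalent calls would presumably be
Hadoop's FileSystem.exists and FileSystem.getFileStatus(path).getLen().

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class BulkloadCopySketch {

  /**
   * Copies source into backupDir unless a file with the same name and the
   * same length is already there. Returns true if the file was uploaded,
   * false if the upload was skipped. (Hypothetical helper for illustration.)
   */
  public static boolean copyIfMissing(Path source, Path backupDir) throws IOException {
    Path target = backupDir.resolve(source.getFileName());

    // Assumption: equal length means the earlier upload completed.
    if (Files.exists(target) && Files.size(target) == Files.size(source)) {
      return false; // already backed up, nothing to do
    }

    // Missing or incomplete: (re)upload, replacing any partial copy.
    Files.createDirectories(backupDir);
    Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
    return true;
  }

  public static void main(String[] args) throws IOException {
    Path hfile = Files.createTempFile("bulkload-hfile", ".tmp");
    Files.write(hfile, new byte[] {1, 2, 3});
    Path backupDir = Files.createTempDirectory("backup");

    System.out.println(copyIfMissing(hfile, backupDir)); // prints "true"
    // Simulates reprocessing the same WAL entry after a failed WAL close.
    System.out.println(copyIfMissing(hfile, backupDir)); // prints "false"
  }
}
```

With a check like this, replaying WAL entries whose offset was never advanced
would skip the files that were already copied instead of overwriting them.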
> Copy Bulkloaded Files in Continuous Backup
> ------------------------------------------
>
> Key: HBASE-29519
> URL: https://issues.apache.org/jira/browse/HBASE-29519
> Project: HBase
> Issue Type: Sub-task
> Components: backup&restore
> Reporter: Vinayak Hegde
> Assignee: Vinayak Hegde
> Priority: Major
> Labels: pull-request-available
>
> Enhance the continuous backup replication endpoint to detect bulkload
> operations and copy their HFiles to the backup location (e.g., S3).