[ 
https://issues.apache.org/jira/browse/HBASE-29519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18014676#comment-18014676
 ] 

Andor Molnar commented on HBASE-29519:
--------------------------------------

{quote}we need the {{hbase.replication.bulkload.enabled}} configuration enabled
{quote}
I think the intention of this HBase option is to let the user handle bulkload 
files manually. With replication, for example, a user might want HBase to 
replicate only the WAL entries; if a bulkload happens on one cluster, which is 
a manual operation anyway, the user can go to the other cluster and run the 
same bulkload there. Since the relative order of bulkloads and WAL entries 
doesn't matter, this could be more efficient than relying on HBase to move 
bulkload files around.

The same applies to backup. We don't need to enforce this option in any way. 
Show a warning to the user when continuous backup is set up, saying 'hey, be 
aware that I won't back up or restore your bulkload files, you have to deal 
with them on your own' - this should be enough. Bulkload files are probably in 
a safe place already (S3), and in case of a disaster the user should be able 
to restore them manually in an optimal way.

> Copy Bulkloaded Files in Continuous Backup
> ------------------------------------------
>
>                 Key: HBASE-29519
>                 URL: https://issues.apache.org/jira/browse/HBASE-29519
>             Project: HBase
>          Issue Type: Sub-task
>          Components: backup&restore
>            Reporter: Vinayak Hegde
>            Assignee: Vinayak Hegde
>            Priority: Major
>              Labels: pull-request-available
>
> Enhance the continuous backup replication endpoint to detect bulkload 
> operations and copy their HFiles to the backup location (e.g., S3). 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
