anmolnar commented on code in PR #7222:
URL: https://github.com/apache/hbase/pull/7222#discussion_r2282994951
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/replication/ContinuousBackupReplicationEndpoint.java:
##########
@@ -381,6 +430,48 @@ private String formatToDateString(long dayInMillis) {
return dateFormat.format(new Date(dayInMillis));
}
+  private Path getBulkLoadFileStagingPath(Path relativePathFromNamespace) throws IOException {
+    FileSystem rootFs = CommonFSUtils.getRootDirFileSystem(conf);
+    Path rootDir = CommonFSUtils.getRootDir(conf);
+    Path baseNSDir = new Path(HConstants.BASE_NAMESPACE_DIR);
+    Path baseNamespaceDir = new Path(rootDir, baseNSDir);
+    Path hFileArchiveDir =
+      new Path(rootDir, new Path(HConstants.HFILE_ARCHIVE_DIRECTORY, baseNSDir));
+
+    LOG.debug("{} Searching for bulk load file: {} in paths: {}, {}", Utils.logPeerId(peerId),
+      relativePathFromNamespace, baseNamespaceDir, hFileArchiveDir);
+
+    Path result =
+      findExistingPath(rootFs, baseNamespaceDir, hFileArchiveDir, relativePathFromNamespace);
+
+    if (result == null) {
+      LOG.error("{} No bulk loaded file found in relative path: {}", Utils.logPeerId(peerId),
+        relativePathFromNamespace);
+      throw new IOException(
Review Comment:
You might want to throw a different type of exception here than the one thrown by the
`FileUtil.copy(...)` operation, so callers can tell the two failure modes apart.
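One way to address this (a hypothetical sketch, not part of the PR — the class name `BulkLoadFileMissingException` and the `resolveStagingPath` helper are illustrative inventions): define a dedicated `IOException` subclass for the missing-staging-file case, so generic `IOException`s from the copy step propagate unchanged while the lookup failure stays catchable on its own.

```java
import java.io.IOException;

// Hypothetical exception type for "bulk load file not found in staging",
// distinct from the IOExceptions that a copy operation can throw.
class BulkLoadFileMissingException extends IOException {
  BulkLoadFileMissingException(String message) {
    super(message);
  }
}

public class BulkLoadLookupSketch {
  // Stand-in for getBulkLoadFileStagingPath: returns the resolved path,
  // or throws the dedicated exception when no candidate location matches.
  static String resolveStagingPath(String relativePath, boolean found)
      throws BulkLoadFileMissingException {
    if (!found) {
      throw new BulkLoadFileMissingException(
        "No bulk loaded file found in relative path: " + relativePath);
    }
    return "/hbase/data/" + relativePath;
  }

  public static void main(String[] args) {
    try {
      resolveStagingPath("ns/table/cf/hfile", false);
    } catch (BulkLoadFileMissingException e) {
      // Callers can now handle this case specifically instead of
      // pattern-matching on a generic IOException message.
      System.out.println("missing: " + e.getMessage());
    }
  }
}
```

Since the subclass still extends `IOException`, existing `throws IOException` signatures need no change.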
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java:
##########
@@ -190,6 +191,14 @@ private void handleContinuousBackup(Admin admin) throws IOException {
     // set overall backup status: complete. Here we make sure to complete the backup.
     // After this checkpoint, even if entering cancel process, will let the backup finished
     backupInfo.setState(BackupState.COMPLETE);
+
+    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
+      System.out.println("NOTE: Bulkload replication is not enabled. "
+        + "Bulk loaded files will not be backed up as part of continuous backup. "
+        + "To ensure bulk loaded files are included in the backup, please enable bulkload replication "
+        + "(hbase.replication.bulkload.enabled=true) and configure other necessary settings "
+        + "to properly enable bulkload replication.");
Review Comment:
Message is confusing. "You" are HBase, so please tell the user _exactly_
what needs to be done in order to enable bulkload backup.
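A more actionable message could name the exact configuration keys instead of pointing at "other necessary settings". A minimal sketch of what that might look like (the wording is a suggestion; `hbase.replication.bulkload.enabled` comes from the diff above, while `hbase.replication.cluster.id` is mentioned here on the assumption that bulkload replication requires a cluster id to be set — verify against the HBase docs before adopting it):

```java
public class BulkloadWarningSketch {
  // Hypothetical replacement message: tells the user exactly which keys
  // to set rather than vaguely referring to "other necessary settings".
  static String bulkloadDisabledWarning() {
    return "WARNING: Bulkload replication is disabled, so bulk loaded files "
      + "will not be included in continuous backups. To include them, set "
      + "hbase.replication.bulkload.enabled=true and "
      + "hbase.replication.cluster.id=<unique id> on the source cluster, "
      + "then restart the region servers.";
  }

  public static void main(String[] args) {
    System.out.println(bulkloadDisabledWarning());
  }
}
```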
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java:
##########
@@ -190,6 +191,14 @@ private void handleContinuousBackup(Admin admin) throws IOException {
     // set overall backup status: complete. Here we make sure to complete the backup.
     // After this checkpoint, even if entering cancel process, will let the backup finished
     backupInfo.setState(BackupState.COMPLETE);
+
+    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
+      System.out.println("NOTE: Bulkload replication is not enabled. "
+        + "Bulk loaded files will not be backed up as part of continuous backup. "
Review Comment:
```
Since continuous backup uses HBase replication, bulk loaded files won't
be backed up as part of continuous backup.
```
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/replication/ContinuousBackupReplicationEndpoint.java:
##########
@@ -372,6 +385,42 @@ private void close() {
}
}
+  private void uploadBulkLoadFiles(long dayInMillis, List<Path> bulkLoadFiles) throws IOException {
+    LOG.debug("{} Starting upload of {} bulk load files", Utils.logPeerId(peerId),
+      bulkLoadFiles.size());
+
+    if (LOG.isTraceEnabled()) {
+      LOG.trace("{} Bulk load files to upload: {}", Utils.logPeerId(peerId),
+        bulkLoadFiles.stream().map(Path::toString).collect(Collectors.joining(", ")));
+    }
+    String dayDirectoryName = formatToDateString(dayInMillis);
+    Path bulkloadDir = new Path(backupFileSystemManager.getBulkLoadFilesDir(), dayDirectoryName);
+    backupFileSystemManager.getBackupFs().mkdirs(bulkloadDir);
+
+    for (Path file : bulkLoadFiles) {
+      Path sourcePath = getBulkLoadFileStagingPath(file);
+      Path destPath = new Path(bulkloadDir, file);
+
+      try {
+        LOG.debug("{} Copying bulk load file from {} to {}", Utils.logPeerId(peerId), sourcePath,
+          destPath);
+
+        FileUtil.copy(CommonFSUtils.getRootDirFileSystem(conf), sourcePath,
+          backupFileSystemManager.getBackupFs(), destPath, false, conf);
+
+        LOG.info("{} Bulk load file {} successfully backed up to {}", Utils.logPeerId(peerId), file,
+          destPath);
+      } catch (IOException e) {
+        String errorMsg = String.format("%s Failed to back up bulk load file %s to %s",
+          Utils.logPeerId(peerId), file, destPath);
+        LOG.error("{}: {}", errorMsg, e.getMessage(), e);
+        throw new IOException(errorMsg, e);
Review Comment:
Before returning from a failed upload operation, you probably want to make sure no
garbage is left at the destination. For instance, if resuming the upload is not
supported, the partially uploaded file should be deleted.
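The cleanup could look roughly like the sketch below. It is self-contained using `java.nio.file` rather than the Hadoop `FileSystem` API, so the shape is the assumption here; in the PR itself, the delete would target `destPath` on `backupFileSystemManager.getBackupFs()` inside the existing catch block.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CopyWithCleanupSketch {
  // Copy source to dest; on failure, best-effort delete of the partial
  // destination file before rethrowing, so no garbage is left behind.
  static void copyOrCleanUp(Path source, Path dest) throws IOException {
    try {
      Files.copy(source, dest);
    } catch (IOException e) {
      try {
        Files.deleteIfExists(dest); // remove partially written file, if any
      } catch (IOException cleanupFailure) {
        e.addSuppressed(cleanupFailure); // keep the original cause primary
      }
      throw new IOException("Failed to back up " + source + " to " + dest, e);
    }
  }

  public static void main(String[] args) throws IOException {
    Path source = Files.createTempFile("bulkload", ".hfile");
    Path dest = source.resolveSibling("backup-" + source.getFileName());
    copyOrCleanUp(source, dest);
    System.out.println("copied: " + Files.exists(dest));
    Files.delete(dest);
    Files.delete(source);
  }
}
```

Attaching the cleanup failure via `addSuppressed` keeps the original copy error as the primary cause while still recording why the delete did not succeed.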
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/FullTableBackupClient.java:
##########
@@ -190,6 +191,14 @@ private void handleContinuousBackup(Admin admin) throws IOException {
     // set overall backup status: complete. Here we make sure to complete the backup.
     // After this checkpoint, even if entering cancel process, will let the backup finished
     backupInfo.setState(BackupState.COMPLETE);
+
+    if (!conf.getBoolean(REPLICATION_BULKLOAD_ENABLE_KEY, false)) {
+      System.out.println("NOTE: Bulkload replication is not enabled. "
Review Comment:
```
WARNING: ...
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]