taklwu commented on code in PR #7239:
URL: https://github.com/apache/hbase/pull/7239#discussion_r2299337985
##########
hbase-backup/src/main/java/org/apache/hadoop/hbase/backup/impl/BackupCommands.java:
##########
@@ -1014,6 +1015,15 @@ private void deleteAllBackupWALFiles(Configuration conf, String backupWalDir)
System.out.println("Deleted all contents under WAL directory: " + walDir);
}
+ // Delete contents under bulk load directory
+ if (fs.exists(bulkloadDir)) {
+ FileStatus[] bulkContents = fs.listStatus(bulkloadDir);
+ for (FileStatus item : bulkContents) {
+ fs.delete(item.getPath(), true); // recursive delete of each child
+ }
+ System.out.println("Deleted all contents under Bulk Load directory: " + bulkloadDir);
Review Comment:
If the required directory doesn't exist, e.g. on the first use of backup, what happens here? Will we create it?
The problem is that this list-then-delete-per-child pattern can be expensive on S3; that's why S3A has the `fs.s3a.multiobjectdelete.enable` feature (applied when `fs.delete` is called recursively on a path). If we do it this way here and the list of bulkloaded files under the bulkload directory is huge, you will hit S3 throttling very easily, and we would end up writing/forking the same batching implementation here.
So, instead of deleting each child to keep the required directory structure in place, I suggest handling this as a check at enable/re-enable time for the required directory structure: delete the directory recursively and recreate it, to avoid a large number of per-object delete calls.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]