ctubbsii commented on code in PR #5458:
URL: https://github.com/apache/accumulo/pull/5458#discussion_r2045448783
##########
server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManager.java:
##########
@@ -207,28 +208,51 @@ default Volume getFirst() {
Logger log = LoggerFactory.getLogger(VolumeManager.class);
-  static InstanceId getInstanceIDFromHdfs(Path instanceDirectory, Configuration hadoopConf) {
+  static InstanceInfo getInstanceInfoFromHdfs(Path instanceDirectory, Configuration hadoopConf) {
     try {
-      FileSystem fs =
-          VolumeConfiguration.fileSystemForPath(instanceDirectory.toString(), hadoopConf);
+      log.debug("Trying to read instance info from {}", instanceDirectory);
+      var fs = VolumeConfiguration.fileSystemForPath(instanceDirectory.toString(), hadoopConf);
       FileStatus[] files = null;
       try {
         files = fs.listStatus(instanceDirectory);
       } catch (FileNotFoundException ex) {
         // ignored
       }
-      log.debug("Trying to read instance id from {}", instanceDirectory);
-      if (files == null || files.length == 0) {
-        log.error("unable to obtain instance id at {}", instanceDirectory);
-        throw new IllegalStateException(
-            "Accumulo not initialized, there is no instance id at " + instanceDirectory);
-      } else if (files.length != 1) {
-        log.error("multiple potential instances in {}", instanceDirectory);
+      InstanceId instanceId = null;
Review Comment:
That's what I originally was going to do. However, the current design is to
rely only on the filename/directory contents, rather than trying to read any
files. This reduces the number of round-trip RPC calls, because there is no
need to talk to any HDFS DataNode, and it is therefore much more robust against
whole classes of cluster failure scenarios. That's the current design, so I
kept the same goal.
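The design described above — recovering the instance id from the *name* of the lone entry in the instance directory, without ever opening a file — can be sketched with plain `java.nio.file` in place of the Hadoop `FileSystem` API. This is a hypothetical illustration of the pattern, not the actual Accumulo code; the class and method names are invented for the sketch:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class InstanceIdFromDirName {

  // Hypothetical sketch: derive an instance id from the name of the single
  // entry in a directory. Only a directory listing (a metadata operation in
  // the HDFS analogue, answered by the NameNode) is performed; the entry's
  // contents are never read, so no DataNode round trips are needed.
  static String instanceIdFromDirectory(Path instanceDirectory) throws IOException {
    List<Path> entries = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(instanceDirectory)) {
      stream.forEach(entries::add);
    }
    if (entries.isEmpty()) {
      throw new IllegalStateException("no instance id at " + instanceDirectory);
    }
    if (entries.size() != 1) {
      throw new IllegalStateException("multiple potential instances in " + instanceDirectory);
    }
    // The entry's file name *is* the id.
    return entries.get(0).getFileName().toString();
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("instance_id");
    Files.createFile(dir.resolve("abc-123"));
    System.out.println(instanceIdFromDirectory(dir)); // prints "abc-123"
  }
}
```

Because the id lives in the filename, a single `listStatus`-style call suffices even when DataNodes are unreachable, which is the robustness property the comment refers to.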
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]