ctubbsii commented on code in PR #5458:
URL: https://github.com/apache/accumulo/pull/5458#discussion_r2045448783
##########
server/base/src/main/java/org/apache/accumulo/server/fs/VolumeManager.java:
##########
@@ -207,28 +208,51 @@ default Volume getFirst() {
Logger log = LoggerFactory.getLogger(VolumeManager.class);
-  static InstanceId getInstanceIDFromHdfs(Path instanceDirectory, Configuration hadoopConf) {
+  static InstanceInfo getInstanceInfoFromHdfs(Path instanceDirectory, Configuration hadoopConf) {
     try {
-      FileSystem fs =
-          VolumeConfiguration.fileSystemForPath(instanceDirectory.toString(), hadoopConf);
+      log.debug("Trying to read instance info from {}", instanceDirectory);
+      var fs = VolumeConfiguration.fileSystemForPath(instanceDirectory.toString(), hadoopConf);
       FileStatus[] files = null;
       try {
         files = fs.listStatus(instanceDirectory);
       } catch (FileNotFoundException ex) {
         // ignored
       }
-      log.debug("Trying to read instance id from {}", instanceDirectory);
-      if (files == null || files.length == 0) {
-        log.error("unable to obtain instance id at {}", instanceDirectory);
-        throw new IllegalStateException(
-            "Accumulo not initialized, there is no instance id at " + instanceDirectory);
-      } else if (files.length != 1) {
-        log.error("multiple potential instances in {}", instanceDirectory);
+      InstanceId instanceId = null;
Review Comment:
That's what I originally was going to do. However, the current design is to
rely only on the filename/directory contents, rather than trying to read any
files. This reduces the number of round-trip RPC calls, because there is no
need to talk to any HDFS DataNode, and it is therefore much more robust against
whole classes of cluster failure scenarios. That's the current design, so I
kept the same goal.
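The design described above — recovering the instance id from the *name* of the lone entry in the instance directory, without ever opening a file — can be sketched with plain `java.nio.file` in place of the Hadoop `FileSystem` API. This is a hypothetical illustration of the pattern, not the actual Accumulo code; the class and method names are invented for the sketch:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class InstanceIdFromDirName {

  // Hypothetical sketch: derive an instance id from the name of the single
  // entry in a directory. Only a directory listing (a metadata operation in
  // the HDFS analogue, answered by the NameNode) is performed; the entry's
  // contents are never read, so no DataNode round trips are needed.
  static String instanceIdFromDirectory(Path instanceDirectory) throws IOException {
    List<Path> entries = new ArrayList<>();
    try (DirectoryStream<Path> stream = Files.newDirectoryStream(instanceDirectory)) {
      stream.forEach(entries::add);
    }
    if (entries.isEmpty()) {
      throw new IllegalStateException("no instance id at " + instanceDirectory);
    }
    if (entries.size() != 1) {
      throw new IllegalStateException("multiple potential instances in " + instanceDirectory);
    }
    // The entry's file name *is* the id.
    return entries.get(0).getFileName().toString();
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("instance_id");
    Files.createFile(dir.resolve("abc-123"));
    System.out.println(instanceIdFromDirectory(dir)); // prints "abc-123"
  }
}
```

Because the id lives in the filename, a single `listStatus`-style call suffices even when DataNodes are unreachable, which is the robustness property the comment refers to.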
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]