Ajmal Ahammed created HDFS-15597:
------------------------------------

             Summary: ContentSummary.getSpaceConsumed does not consider replication
                 Key: HDFS-15597
                 URL: https://issues.apache.org/jira/browse/HDFS-15597
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: dfs
    Affects Versions: 2.6.0
            Reporter: Ajmal Ahammed

I am trying to get the disk space consumed by an HDFS directory using the {{ContentSummary.getSpaceConsumed}} method, but the value it returns does not account for the replication factor. The replication factor is 2, so I expected the method to return twice the actual file size.

The directory listing shows a replication factor of 2 for the file:

{code}
ubuntu@ubuntu:~/ht$ sudo -u hdfs hdfs dfs -ls /var/lib/ubuntu
Found 2 items
-rw-r--r--   2 ubuntu ubuntu    3145728 2020-09-08 09:55 /var/lib/ubuntu/size-test
drwxrwxr-x   - ubuntu ubuntu          0 2020-09-07 06:37 /var/lib/ubuntu/test
{code}

But when I run the following code,

{code}
String path = "/etc/hadoop/conf/";
conf.addResource(new Path(path + "core-site.xml"));
conf.addResource(new Path(path + "hdfs-site.xml"));
long size = FileContext.getFileContext(conf).util().getContentSummary(fileStatus).getSpaceConsumed();
System.out.println("Replication : " + fileStatus.getReplication());
System.out.println("File size : " + size);
{code}

The output is

{code}
Replication : 0
File size : 3145728
{code}

Both the file size and the replication factor seem to be incorrect. /etc/hadoop/conf/hdfs-site.xml contains the following configuration:

{code}
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
{code}
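
For reference, the expectation stated in this report is plain arithmetic: the file length multiplied by the replication factor. The sketch below only reproduces that calculation with the numbers from the listing above. It does not call any Hadoop API, and the class and method names ({{ExpectedSpaceConsumed}}, {{expectedSpaceConsumed}}) are hypothetical, introduced here purely for illustration. It also assumes every block is fully replicated.

```java
// Minimal sketch of the reporter's expectation; no Hadoop classes are used.
// ExpectedSpaceConsumed and expectedSpaceConsumed are hypothetical names.
final class ExpectedSpaceConsumed {

    // Raw bytes expected across all replicas, assuming full replication.
    // multiplyExact throws ArithmeticException on overflow instead of wrapping.
    static long expectedSpaceConsumed(long fileLength, short replication) {
        return Math.multiplyExact(fileLength, (long) replication);
    }

    public static void main(String[] args) {
        long fileLength = 3145728L; // length of size-test from the listing
        short replication = 2;      // replication column from the listing

        System.out.println("File length             : " + fileLength);
        System.out.println("Replication             : " + replication);
        System.out.println("Expected space consumed : "
                + expectedSpaceConsumed(fileLength, replication));
    }
}
```

With the values from the listing, the expected figure is 6291456 bytes, whereas the program in the report printed 3145728.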