[PR] HDDS-13045. Implement Immediate Triggering of Heartbeat when Volume Full [ozone]

via GitHub Tue, 20 May 2025 05:02:13 -0700


siddhantsangwan opened a new pull request, #8492:
URL: https://github.com/apache/ozone/pull/8492


   ## What changes were proposed in this pull request?
   
   This pull request implements part of the design proposed in 
[HDDS-12929](https://issues.apache.org/jira/browse/HDDS-12929). It covers only 
detecting a full volume, getting the latest storage report, adding the 
container action, and then triggering a heartbeat immediately (or throttling 
it).
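
The "trigger immediately, or throttle" step can be pictured with a minimal sketch. The class `FullVolumeHeartbeatThrottle`, its `minIntervalMillis` parameter, and the `Runnable` callback are hypothetical names for illustration and are not taken from the Ozone code base; the actual change may throttle differently.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not a class from the Ozone code base. */
final class FullVolumeHeartbeatThrottle {
  private final long minIntervalMillis;     // assumed throttle window
  private final LongSupplier clockMillis;   // injectable time source, assumed non-negative
  private final Runnable triggerHeartbeat;  // stands in for the real heartbeat trigger
  private final AtomicLong lastTriggerMillis = new AtomicLong(-1L); // -1 means "never triggered"

  FullVolumeHeartbeatThrottle(long minIntervalMillis, LongSupplier clockMillis,
      Runnable triggerHeartbeat) {
    this.minIntervalMillis = minIntervalMillis;
    this.clockMillis = clockMillis;
    this.triggerHeartbeat = triggerHeartbeat;
  }

  /**
   * Called when a full volume is detected. Triggers a heartbeat unless one was
   * already triggered within the throttle window. Returns true if it triggered.
   */
  boolean onVolumeFull() {
    long now = clockMillis.getAsLong();
    long last = lastTriggerMillis.get();
    boolean due = last < 0 || now - last >= minIntervalMillis;
    // compareAndSet lets only one of several concurrent writer threads win.
    if (due && lastTriggerMillis.compareAndSet(last, now)) {
      triggerHeartbeat.run();
      return true;
    }
    return false;
  }
}
```

With a window of one minute, a first call triggers a heartbeat, a second call 30 seconds later is suppressed, and a call after the window has elapsed triggers again.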
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-13045
   
   ## How was this patch tested?
   
   Modified existing unit tests. Also tested manually using the Ozone 
docker compose cluster.
   
   a. Simulated a nearly full volume with a capacity of 2 GB, 150 MB of 
available space, and a min free space of 100 MB. Datanode log:
   ```
   2025-05-20 09:47:05,899 [main] INFO volume.HddsVolume: HddsVolume: { 
id=DS-64dd669c-71fe-492f-903c-4fc7dbe4440a dir=/data/hdds/hdds type=DISK 
capacity=2147268899 used=1990197248 available=157071651 minFree=104857600 
committed=0 }
   ```
   
   b. Wrote 100 MB of data using freon, expecting an immediate heartbeat to 
be triggered as soon as the available space dropped to 100 MB. The Datanode 
log shows that this happened at 09:50:52:
   ```
   2025-05-20 09:50:52,028 
[f8714dd7-31fc-4c63-9703-6fdb1a59b5c4-ChunkWriter-7-0] INFO 
impl.HddsDispatcher: Triggering heartbeat for full volume /data/hdds/hdds, with 
node report storageReport {
      storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
      storageLocation: "/data/hdds/hdds"
      capacity: 2147268899
      scmUsed: 2042626048
      remaining: 104642851
      storageType: DISK
      failed: false
      committed: 0
      freeSpaceToSpare: 104857600
    }
    metadataStorageReport {
      storageLocation: "/data/metadata/ratis"
      storageType: DISK
      capacity: 2147268899
      scmUsed: 1990197248
      remaining: 157071651
      failed: false
    }
   ```
   
   c. In the SCM, the last storage report _BEFORE_ the write operation was 
received at 09:50:09:
   ```
   2025-05-20 09:50:09,399 [IPC Server handler 12 on default port 9861] INFO 
server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
      storageUuid: "DS-27210be2-ee53-4035-a3a3-63ec8a162456"
      storageLocation: "/data/hdds/hdds"
      capacity: 2147268899
      scmUsed: 1990197248
      remaining: 157071651
      storageType: DISK
      failed: false
      committed: 0
      freeSpaceToSpare: 104857600
    }
    metadataStorageReport {
      storageLocation: "/data/metadata/ratis"
      storageType: DISK
      capacity: 2147268899
      scmUsed: 1990197248
      remaining: 157071651
      failed: false
    }
   ```
   So, the next storage report should be received a minute later, at 09:51:09, 
unless one is triggered immediately because the volume is full. The SCM log 
shows that the immediately triggered report was received at 09:50:52, matching 
the DN log:
   ```
   2025-05-20 09:50:52,033 [IPC Server handler 4 on default port 9861] INFO 
server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
      storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
      storageLocation: "/data/hdds/hdds"
      capacity: 2147268899
      scmUsed: 2042626048
      remaining: 104642851
      storageType: DISK
      failed: false
      committed: 0
      freeSpaceToSpare: 104857600
    }
    metadataStorageReport {
      storageLocation: "/data/metadata/ratis"
      storageType: DISK
      capacity: 2147268899
      scmUsed: 1990197248
      remaining: 157071651
      failed: false
    }
   ```
   The next storage report was received at the expected time of 09:51:09, 
showing that throttling also worked.
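
The numbers in the logs above can be cross-checked with a short sketch. It assumes a volume counts as full once its remaining space is at or below the min free space; that rule and the class name `LoggedValuesCheck` are assumptions for illustration, not the exact condition used in the patch.

```java
import java.time.Duration;
import java.time.LocalTime;

/** Hypothetical cross-check of the logged values; not part of the patch. */
final class LoggedValuesCheck {
  // Assumed rule: a volume is full once remaining space <= min free space.
  static boolean isFull(long remainingBytes, long minFreeBytes) {
    return remainingBytes <= minFreeBytes;
  }

  public static void main(String[] args) {
    long minFree = 104_857_600L;                           // freeSpaceToSpare in the reports
    System.out.println(isFull(157_071_651L, minFree));     // before the write: false
    System.out.println(isFull(104_642_851L, minFree));     // triggered report: true

    LocalTime lastRegular = LocalTime.parse("09:50:09");   // last report before the write
    LocalTime triggered = LocalTime.parse("09:50:52");     // immediately triggered report
    // Seconds between the regular report and the triggered one: 43
    System.out.println(Duration.between(lastRegular, triggered).getSeconds());
    // Next regular report, one minute after the last one: 09:51:09
    System.out.println(lastRegular.plusMinutes(1));
  }
}
```

The triggered report arrived 43 seconds after the last regular one, that is, 17 seconds ahead of the regular one-minute schedule.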


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

