siddhantsangwan opened a new pull request, #8492: URL: https://github.com/apache/ozone/pull/8492
## What changes were proposed in this pull request?

This pull request is for implementing a part of the design proposed in [HDDS-12929](https://issues.apache.org/jira/browse/HDDS-12929). This only contains the implementation for detecting a full volume, getting the latest storage report, adding the container action, then immediately triggering (or throttling) a heartbeat.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13045

## How was this patch tested?

Modified existing unit tests. Also did some manual testing using the ozone docker compose cluster.

a. Simulated a close to full volume with a capacity of 2 GB, available space of 150 MB and min free space of 100 MB. Datanode log:
```
2025-05-20 09:47:05,899 [main] INFO volume.HddsVolume: HddsVolume: { id=DS-64dd669c-71fe-492f-903c-4fc7dbe4440a dir=/data/hdds/hdds type=DISK capacity=2147268899 used=1990197248 available=157071651 minFree=104857600 committed=0 }
```

b. Wrote 100 MB of data using freon, with the expectation that an immediate heartbeat will be triggered as soon as the available space drops to 100 MB. Datanode log shows that this happened at 09:50:52:
```
2025-05-20 09:50:52,028 [f8714dd7-31fc-4c63-9703-6fdb1a59b5c4-ChunkWriter-7-0] INFO impl.HddsDispatcher: Triggering heartbeat for full volume /data/hdds/hdds, with node report storageReport {
  storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
  storageLocation: "/data/hdds/hdds"
  capacity: 2147268899
  scmUsed: 2042626048
  remaining: 104642851
  storageType: DISK
  failed: false
  committed: 0
  freeSpaceToSpare: 104857600
}
metadataStorageReport {
  storageLocation: "/data/metadata/ratis"
  storageType: DISK
  capacity: 2147268899
  scmUsed: 1990197248
  remaining: 157071651
  failed: false
}
```

c.
In the SCM, the last storage report _BEFORE_ the write operation was received at 09:50:09:
```
2025-05-20 09:50:09,399 [IPC Server handler 12 on default port 9861] INFO server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
  storageUuid: "DS-27210be2-ee53-4035-a3a3-63ec8a162456"
  storageLocation: "/data/hdds/hdds"
  capacity: 2147268899
  scmUsed: 1990197248
  remaining: 157071651
  storageType: DISK
  failed: false
  committed: 0
  freeSpaceToSpare: 104857600
}
metadataStorageReport {
  storageLocation: "/data/metadata/ratis"
  storageType: DISK
  capacity: 2147268899
  scmUsed: 1990197248
  remaining: 157071651
  failed: false
}
```

So, the next storage report should be received a minute later at 09:51:09, unless it's triggered immediately due to volume full. The SCM log shows that the immediately triggered report was received at 09:50:52, corresponding to the DN log:
```
2025-05-20 09:50:52,033 [IPC Server handler 4 on default port 9861] INFO server.SCMDatanodeHeartbeatDispatcher: Dispatching Node Report storageReport {
  storageUuid: "DS-bd34474b-8fd4-49be-be78-72e708b543c0"
  storageLocation: "/data/hdds/hdds"
  capacity: 2147268899
  scmUsed: 2042626048
  remaining: 104642851
  storageType: DISK
  failed: false
  committed: 0
  freeSpaceToSpare: 104857600
}
metadataStorageReport {
  storageLocation: "/data/metadata/ratis"
  storageType: DISK
  capacity: 2147268899
  scmUsed: 1990197248
  remaining: 157071651
  failed: false
}
```

The next storage report is received at the expected time of 09:51:09, showing that throttling also worked.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
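For readers following the test steps above, the detect-then-throttle behaviour can be sketched roughly as below. This is only an illustration, not the code in the pull request: the class `FullVolumeHeartbeatSketch`, its methods, and the 60-second throttle interval are all hypothetical, and the sketch assumes a volume counts as full once its available bytes drop to or below `minFree` (committed space is ignored here, as it is 0 in the logs).

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch of "trigger a heartbeat when a volume is full, but throttle it".
 * None of these names come from the Ozone code base.
 */
final class FullVolumeHeartbeatSketch {
  private static final long NEVER = Long.MIN_VALUE;

  private final long minIntervalMillis;
  private final AtomicLong lastTriggerMillis = new AtomicLong(NEVER);

  FullVolumeHeartbeatSketch(long minIntervalMillis) {
    this.minIntervalMillis = minIntervalMillis;
  }

  /** Assumption: "full" means available space is at or below the reserved minimum. */
  static boolean isVolumeFull(long availableBytes, long minFreeBytes) {
    return availableBytes <= minFreeBytes;
  }

  /** Returns true at most once per interval; safe to call from several writer threads. */
  boolean tryTrigger(long nowMillis) {
    while (true) {
      long last = lastTriggerMillis.get();
      boolean due = last == NEVER || nowMillis - last >= minIntervalMillis;
      if (!due) {
        return false; // throttled: a heartbeat was triggered too recently
      }
      if (lastTriggerMillis.compareAndSet(last, nowMillis)) {
        return true; // this caller won the race and should send the heartbeat
      }
      // Another thread updated the timestamp first; re-check against the new value.
    }
  }

  /** Called after a write; true means "send an immediate heartbeat now". */
  boolean onWrite(long availableBytes, long minFreeBytes, long nowMillis) {
    return isVolumeFull(availableBytes, minFreeBytes) && tryTrigger(nowMillis);
  }

  public static void main(String[] args) {
    FullVolumeHeartbeatSketch sketch = new FullVolumeHeartbeatSketch(60_000L);
    long minFree = 104_857_600L; // minFree value from the datanode log

    System.out.println(sketch.onWrite(157_071_651L, minFree, 0L));      // false: not full yet
    System.out.println(sketch.onWrite(104_642_851L, minFree, 1_000L));  // true: full, first trigger
    System.out.println(sketch.onWrite(104_642_851L, minFree, 2_000L));  // false: throttled
    System.out.println(sketch.onWrite(104_642_851L, minFree, 61_000L)); // true: interval elapsed
  }
}
```

The byte values in `main` are the `available`/`remaining` and `minFree` figures from the logs above; the timestamps are made up for the example.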
To unsubscribe, e-mail: issues-unsubscr...@ozone.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org