adoroszlai opened a new pull request #7: HDDS-1228. Chunk Scanner Checkpoints URL: https://github.com/apache/hadoop-ozone/pull/7 ## What changes were proposed in this pull request? Save timestamp of last successful data scan for each container (in the `.container` file). After a datanode restart, resume data scanning with the container that was least recently scanned. Newly closed containers have no timestamp and are thus scanned first during the next iteration. This will be changed in [HDDS-1369](https://issues.apache.org/jira/browse/HDDS-1369), which proposes to scan newly closed containers immediately. https://issues.apache.org/jira/browse/HDDS-1228 ## How was this patch tested? Created and closed containers. Restarted datanode while scanning was in progress. Verified that after the restart, scanner resumed from the container where it was interrupted. ``` datanode_1 | STARTUP_MSG: Starting HddsDatanodeService datanode_1 | 2019-10-08 19:37:07 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned never datanode_1 | 2019-10-08 19:37:07 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:37:07.570Z datanode_1 | 2019-10-08 19:37:07 INFO ContainerDataScanner:122 - Completed an iteration of container data scrubber in 0 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 1, Number of unhealthy containers found in this iteration : 0 datanode_1 | 2019-10-08 19:37:17 DEBUG ContainerDataScanner:148 - Scanning container 2, last scanned never datanode_1 | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:155 - Completed scan of container 2 at 2019-10-08T19:38:57.402Z datanode_1 | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned at 2019-10-08T19:37:07.570Z datanode_1 | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:38:57.443Z datanode_1 | 2019-10-08 19:38:57 INFO ContainerDataScanner:122 - Completed an iteration of container data scrubber in 1 minutes. Number of iterations (since the data-node restart) : 2, Number of containers scanned in this iteration : 2, Number of unhealthy containers found in this iteration : 0 datanode_1 | 2019-10-08 19:38:57 DEBUG ContainerDataScanner:148 - Scanning container 3, last scanned never datanode_1 | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:155 - Completed scan of container 3 at 2019-10-08T19:39:02.402Z datanode_1 | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:148 - Scanning container 4, last scanned never datanode_1 | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:155 - Completed scan of container 4 at 2019-10-08T19:39:02.430Z datanode_1 | 2019-10-08 19:39:02 DEBUG ContainerDataScanner:148 - Scanning container 5, last scanned never datanode_1 | 2019-10-08 19:39:11 ERROR HddsDatanodeService:75 - RECEIVED SIGNAL 15: SIGTERM datanode_1 | STARTUP_MSG: Starting HddsDatanodeService datanode_1 | 2019-10-08 19:39:22 DEBUG ContainerDataScanner:148 - Scanning container 5, last scanned never datanode_1 | 2019-10-08 19:40:18 DEBUG ContainerDataScanner:155 - Completed scan of container 5 at 2019-10-08T19:40:18.268Z datanode_1 | 2019-10-08 19:40:18 DEBUG ContainerDataScanner:148 - Scanning container 6, last scanned never datanode_1 | 2019-10-08 19:40:31 DEBUG ContainerDataScanner:155 - Completed scan of container 6 at 2019-10-08T19:40:31.735Z datanode_1 | 2019-10-08 19:40:31 DEBUG ContainerDataScanner:148 - Scanning container 2, last scanned at 2019-10-08T19:38:57.402Z datanode_1 | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:155 - Completed scan of container 2 at 2019-10-08T19:42:12.128Z datanode_1 | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:148 - Scanning container 1, last scanned at 2019-10-08T19:38:57.443Z datanode_1 | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:155 - Completed scan of container 1 at 2019-10-08T19:42:12.140Z datanode_1 | 2019-10-08 19:42:12 DEBUG ContainerDataScanner:148 - Scanning container 3, last scanned at 2019-10-08T19:39:02.402Z datanode_1 | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:155 - Completed scan of container 3 at 2019-10-08T19:42:16.629Z datanode_1 | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:148 - Scanning container 4, last scanned at 2019-10-08T19:39:02.430Z datanode_1 | 2019-10-08 19:42:16 DEBUG ContainerDataScanner:155 - Completed scan of container 4 at 2019-10-08T19:42:16.669Z datanode_1 | 2019-10-08 19:42:16 INFO ContainerDataScanner:122 - Completed an iteration of container data scrubber in 2 minutes. Number of iterations (since the data-node restart) : 1, Number of containers scanned in this iteration : 6, Number of unhealthy containers found in this iteration : 0 ``` Also tested upgrade from Ozone 0.4.0. (Downgrade does not work, see [HDDS-2268](https://issues.apache.org/jira/browse/HDDS-2268).)
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org