[jira] [Commented] (HDFS-13538) HDFS DiskChecker should handle disk full situation

ASF GitHub Bot (Jira) Thu, 28 Aug 2025 23:32:12 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-13538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016950#comment-18016950
 ]


ASF GitHub Bot commented on HDFS-13538:
---------------------------------------

shkhrgpt opened a new pull request, #7918:
URL: https://github.com/apache/hadoop/pull/7918

   
   ### Description of PR
   
   [HADOOP-13738](https://issues.apache.org/jira/browse/HADOOP-13738) enhanced 
   DiskChecker to perform real disk I/O as part of its checks. That change was 
   partially rolled back because of the fsync issue seen in 
   [HADOOP-15450](https://issues.apache.org/jira/browse/HADOOP-15450) and because 
   a full disk was being flagged as a check failure 
   ([HDFS-13538](https://issues.apache.org/jira/browse/HDFS-13538)).
   
   This PR aims to address 
   [HDFS-13538](https://issues.apache.org/jira/browse/HDFS-13538) and to enable 
   I/O-based disk checking for HDFS only, behind a flag that can be turned on in 
   the DFS configuration.
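   
   As a rough illustration of the idea (not the code in this PR), the sketch 
   below probes a directory with a small write plus fsync and reports a full 
   disk separately from a genuine I/O failure. The class name 
   `DiskIoCheckSketch`, the `Result` enum, and the message-based ENOSPC 
   heuristic are all assumptions made for this example; the heuristic in 
   particular depends on the platform's error text and is not how Hadoop's 
   DiskChecker is known to classify errors.
   
   ```java
   import java.io.File;
   import java.io.FileOutputStream;
   import java.io.IOException;

   // Hypothetical sketch: an I/O-based directory probe that does not treat
   // "disk full" as a disk failure. Not the actual DiskChecker implementation.
   public class DiskIoCheckSketch {

       /** Outcome of a single probe. */
       public enum Result { HEALTHY, FULL, FAILED }

       // Assumption: on Linux an ENOSPC write error usually surfaces as an
       // IOException whose message contains this text. Other platforms differ.
       public static boolean looksLikeDiskFull(IOException e) {
           String msg = e.getMessage();
           return msg != null && msg.contains("No space left on device");
       }

       /** Writes and fsyncs one byte in dir, then removes the probe file. */
       public static Result check(File dir) {
           if (!dir.isDirectory()) {
               return Result.FAILED;
           }
           File probe = new File(dir, ".disk-check-" + System.nanoTime());
           try (FileOutputStream out = new FileOutputStream(probe)) {
               out.write(1);
               out.getFD().sync();   // force the byte to the device
               return Result.HEALTHY;
           } catch (IOException e) {
               // A full disk still works for reads; only real errors are FAILED.
               return looksLikeDiskFull(e) ? Result.FULL : Result.FAILED;
           } finally {
               probe.delete();       // best-effort cleanup of the probe file
           }
       }
   }
   ```
   
   A caller could stop scheduling writes to a volume that reports `FULL` while 
   keeping it in service, and only count `FAILED` volumes toward the 
   failed-volume limit.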
   
   ### How was this patch tested?
   
   ```
   [INFO] -------------------------------------------------------
   [INFO]  T E S T S
   [INFO] -------------------------------------------------------
   [INFO] Running org.apache.hadoop.util.TestDiskCheckerWithDiskIo
   [INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.129 s
   ```

> HDFS DiskChecker should handle disk full situation
> --------------------------------------------------
>
>                 Key: HDFS-13538
>                 URL: https://issues.apache.org/jira/browse/HDFS-13538
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Arpit Agarwal
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: HDFS-13538.01.patch
>
>
> Fix disk checker issues reported by [~kihwal] in HADOOP-13738:
> When space is low, the OS returns ENOSPC. Instead of simply stopping writes, 
> the drive is marked bad and replication happens. This makes the cluster-wide 
> space problem worse. If the number of "failed" drives exceeds the DFIP limit, 
> the datanode shuts down.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
