Andrew Wong created KUDU-2372:
---------------------------------

             Summary: Don't let kudu start up if any disks are mounted read-only
                 Key: KUDU-2372
                 URL: https://issues.apache.org/jira/browse/KUDU-2372
             Project: Kudu
          Issue Type: Improvement
          Components: fs
            Reporter: Andrew Wong


Today, if a Kudu tserver runs into EROFS (read-only mount error), it treats the 
error as it would a complete disk failure (EIO), allowing successful startup of 
the server, but failing the tablets that are configured to use the "failed" 
disk.

If something is wrong with how a disk is mounted, it may be better to bring 
immediate attention to the problem and have operators deal with it, rather 
than handling it automatically. As such, it might be helpful to prevent Kudu 
from starting up if errors are detected in the mount configuration.
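
As a rough illustration of the kind of startup check described above, here is 
a minimal Python sketch that asks the OS whether each configured directory 
sits on a read-only mount and refuses to continue if any does. This is not 
Kudu's implementation (Kudu is written in C++); the function names 
find_read_only_dirs and check_mounts_or_exit and the injectable statvfs 
parameter are hypothetical, introduced only for this example.

```python
import os


def find_read_only_dirs(dirs, statvfs=os.statvfs):
    """Return the directories in `dirs` whose filesystem is mounted read-only.

    `statvfs` is injectable so the check can be exercised without real mounts.
    """
    read_only = []
    for d in dirs:
        # f_flag carries the mount flags; ST_RDONLY is set for read-only mounts.
        if statvfs(d).f_flag & os.ST_RDONLY:
            read_only.append(d)
    return read_only


def check_mounts_or_exit(dirs, statvfs=os.statvfs):
    """Abort startup (hypothetical policy) if any directory is read-only."""
    bad = find_read_only_dirs(dirs, statvfs)
    if bad:
        raise SystemExit("refusing to start; read-only mounts: " + ", ".join(bad))
```

A server following this policy would call check_mounts_or_exit on its WAL and 
data directories before opening any tablets, so a bad mount is surfaced 
immediately instead of being folded into disk-failure handling.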

There are tradeoffs here to be considered:
 * The current behavior will evict and delete the data from the failed 
tablets, since the error is treated as an unrecoverable failure. The user can 
ignore such failures and handle them at their leisure, since Kudu will 
re-replicate the tablets lost in this way.
 * If we were to crash instead, operators would get immediate feedback and a 
limited window in which to use `kudu fs update_dirs` to remove the read-only 
drive, or perhaps fix the mountpoint itself.
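
To make the two options concrete, the sketch below models them as a single 
decision over the errors seen while opening each directory. It is only an 
illustration under stated assumptions: plan_startup, StartupAborted, and the 
crash_on_erofs switch are hypothetical names, not Kudu flags or APIs.

```python
import errno


class StartupAborted(Exception):
    """Raised when the (hypothetical) crash-on-EROFS policy stops startup."""


def plan_startup(dir_errors, crash_on_erofs):
    """Decide what to do with per-directory open errors.

    `dir_errors` maps each directory to the errno seen while opening it,
    or None if the directory is healthy. Returns the sorted list of
    directories to mark as failed, or raises StartupAborted.
    """
    read_only = sorted(d for d, e in dir_errors.items() if e == errno.EROFS)
    if crash_on_erofs and read_only:
        # Proposed behavior: surface the mount problem to operators at once.
        raise StartupAborted("read-only directories: " + ", ".join(read_only))
    # Current behavior: EROFS is handled like EIO, so the server starts and
    # only the tablets on these directories are failed.
    return sorted(d for d, e in dir_errors.items()
                  if e in (errno.EIO, errno.EROFS))
```

With crash_on_erofs=False the read-only directory is simply reported as 
failed alongside any EIO directories; with crash_on_erofs=True the same input 
stops startup and leaves the data in place for the operator to sort out.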



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
