[ https://issues.apache.org/jira/browse/KUDU-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414100#comment-16414100 ]
Todd Lipcon commented on KUDU-2372:
-----------------------------------
Per KUDU-2359, I think it may make sense to allow starting up with a bad disk,
so that we don't need manual intervention after a single disk failure (e.g. on
a 12-disk host).
> Don't let kudu start up if any disks are mounted read-only
> ----------------------------------------------------------
>
> Key: KUDU-2372
> URL: https://issues.apache.org/jira/browse/KUDU-2372
> Project: Kudu
> Issue Type: Improvement
> Components: fs
> Reporter: Andrew Wong
> Priority: Major
>
> Today, if a Kudu tserver runs into EROFS (read-only mount error), it treats
> the error as it would a complete disk failure (EIO), allowing successful
> startup of the server, but failing the tablets that are configured to use the
> "failed" disk.
> If something is wrong with the mounting of a disk, it might be better to
> bring it to operators' attention immediately and have them deal with it,
> rather than handling it automatically. As such, it might be helpful to
> prevent Kudu from starting up if errors are detected in the mount
> configuration.
> There are tradeoffs here to be considered:
> * The current behavior will evict and delete the data from the failed
> tablets, as the error is treated as an unrecoverable failure. The user can
> ignore such failures and handle them at their leisure, since Kudu will
> re-replicate the tablets lost in this way.
> * If we were to instead crash, this would give operators immediate feedback
> and a time limit to use `kudu fs update_dirs` to remove the read-only drive,
> or maybe fix the mountpoint itself.
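
The tradeoff between the two behaviors above can be sketched as a small startup policy. This is only an illustration in Python, not Kudu's actual code: the function names `classify_dir_error` and `startup_decision` and the `crash_on_read_only` switch are hypothetical, and the sketch assumes each data directory has already been probed and the errno of any failure recorded.

```python
import errno
from typing import Dict, List, Tuple


def classify_dir_error(err: int) -> str:
    """Map an errno from a data directory probe to a coarse category."""
    if err == errno.EROFS:
        return "read_only"      # directory is on a read-only mount
    if err == errno.EIO:
        return "disk_failure"   # I/O error, treated as a failed disk
    return "other"


def startup_decision(
    dir_errors: Dict[str, int], crash_on_read_only: bool
) -> Tuple[str, List[str]]:
    """Decide whether a server should start, given per-directory errors.

    dir_errors maps a data directory path to the errno seen when probing it.
    Returns ("abort", read_only_dirs) when the policy refuses to start,
    otherwise ("start", failed_dirs), where failed_dirs are the directories
    whose tablets would be failed.
    """
    read_only = sorted(
        d for d, e in dir_errors.items()
        if classify_dir_error(e) == "read_only"
    )
    if crash_on_read_only and read_only:
        # Proposed behavior: refuse to start so operators notice the mount.
        return ("abort", read_only)
    # Current behavior as described: start up and fail the affected dirs.
    return ("start", sorted(dir_errors))


if __name__ == "__main__":
    errors = {"/data/3": errno.EROFS}  # hypothetical path
    print(startup_decision(errors, crash_on_read_only=True))
    print(startup_decision(errors, crash_on_read_only=False))
```

With `crash_on_read_only=False` the sketch models the behavior described as current, which is also what the comment above argues for on a multi-disk host. With `crash_on_read_only=True` it models the proposal in this issue.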