Re: Custom FSError and CommitLog Error Handling

Dinesh Joshi Sun, 17 Dec 2023 22:41:32 -0800

> On Dec 11, 2023, at 11:27 AM, Raymond Huffman <raymondmhuff...@gmail.com> 
> wrote:
> 
> On our fork of Cassandra, we've implemented some custom behavior for handling 
> CommitLog and SSTable Corruption errors. Specifically, if a node detects one 
> of those errors, we want the node to stop itself, and if the node is 
> restarted, we want initialization to fail. This is


This is the correct behavior if you can reliably detect disk / memory failure, 
which is usually the cause of corruption.

> FSErrorHandler, and the error handler that's currently implemented at 
> org.apache.cassandra.db.commitlog.CommitLog#handleCommitError via config in 
> the same way one can provide custom Partitioners and 
> Authenticators/Authorizers.

How would you implement this custom FSErrorHandler? Would it vary significantly 
between operators of Cassandra? If improperly implemented, it may lead to 
serious outages.
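
To make the question concrete, here is a minimal sketch of what config-driven 
loading could look like. Everything in it is an assumption for illustration: 
ErrorHandler, RecordingHandler and HandlerLoader are hypothetical names, not 
Cassandra's actual FSErrorHandler API or config keys. It only shows the general 
"class name from config, instantiate by reflection, fail fast if invalid" 
pattern.

```java
// Hypothetical stand-in for a pluggable handler interface; not Cassandra's API.
interface ErrorHandler {
    void handle(Throwable error);
}

// Hypothetical implementation that only remembers the last error it was given.
final class RecordingHandler implements ErrorHandler {
    volatile Throwable last;

    @Override
    public void handle(Throwable error) {
        last = error;
    }
}

// Hypothetical loader: turns a class name taken from config into a handler.
final class HandlerLoader {
    private HandlerLoader() {}

    static ErrorHandler load(String className) {
        try {
            return Class.forName(className)
                        .asSubclass(ErrorHandler.class) // reject unrelated classes
                        .getDeclaredConstructor()       // requires a no-arg constructor
                        .newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            // Fail fast at startup rather than at the first disk error.
            throw new IllegalArgumentException("Invalid error handler class: " + className, e);
        }
    }
}
```

With something like this, a misconfigured class name is rejected when the node 
starts, but nothing constrains what a valid handler does once it is loaded, 
which is where the outage risk comes from.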

> Would you take as a contribution one of the following?
> 1. user provided implementations of FSErrorHandler and CommitLogErrorHandler, 
> set via config; and/or
> 2. new commit failure and disk failure policies that write a poison pill file 
> to disk and fail on startup if that file exists

Maybe this can be added as a feature to Cassandra without the need to make it 
customizable / pluggable. It appears to be useful as described. If you have a 
branch with the proposed behavior, it might make it easier to clarify any 
questions.
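
As a rough illustration of option 2, a built-in policy could be sketched as 
below. This is only a sketch under stated assumptions: the PoisonPill class, 
the "poison_pill" file name and the directory passed in are all hypothetical, 
and none of it reflects existing Cassandra code.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; not part of Cassandra.
final class PoisonPill {
    // Hypothetical marker file name.
    static final String FILE_NAME = "poison_pill";

    private PoisonPill() {}

    // Called by the failure policy before the node stops itself:
    // leaves a marker file on disk describing why the node stopped.
    static void write(Path dir, String reason) throws IOException {
        Files.createDirectories(dir);
        Files.write(dir.resolve(FILE_NAME), reason.getBytes(StandardCharsets.UTF_8));
    }

    // Called early during startup: refuses to continue while the marker exists.
    static void checkOnStartup(Path dir) {
        Path marker = dir.resolve(FILE_NAME);
        if (Files.exists(marker)) {
            throw new IllegalStateException(
                "Refusing to start: found " + marker
                + "; remove it once the underlying fault is resolved");
        }
    }
}
```

The operator would have to delete the marker by hand after replacing the faulty 
disk or host, so a restart by a supervisor alone would not bring the node back.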

Dinesh
