Hi All,

out of disk handler

I propose to enhance CouchDB to monitor disk occupancy and react automatically 
as free space becomes scarce. I've written a working prototype at: 
https://github.com/apache/couchdb/compare/main...out-of-disk-handler

The `disksup` process, part of the `os_mon` application in Erlang/OTP, already 
monitors disk usage and I suggest we use that as the base, since it supports 
all the platforms we support (and a few more).

The patch reacts differently depending on whether it is database_dir or 
view_index_dir that runs out of space (both can run out at the same time in 
the common case that the same device is used for both), namely:

1) Clustered database updates are prohibited (a 507 Insufficient Storage error 
is returned)
2) Background indexing is suspended (no new jobs will be started)
3) Querying a stale view is prohibited (a 507 Insufficient Storage error is 
returned)
4) Querying an up-to-date view is permitted
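
To make the four rules concrete, here is a rough Python sketch of the decision 
table. It is not the prototype's Erlang code; the function and operation names 
are made up for illustration. I have assumed rule 1 is driven by database_dir 
and rules 2 and 3 by view_index_dir. Only the 507 status and the rules 
themselves come from the list above.

```python
from enum import Enum

INSUFFICIENT_STORAGE = 507  # HTTP status returned when a request is refused


class Op(Enum):
    # Hypothetical operation names, for illustration only.
    CLUSTERED_UPDATE = "clustered_update"
    START_INDEX_JOB = "start_index_job"
    QUERY_STALE_VIEW = "query_stale_view"
    QUERY_FRESH_VIEW = "query_fresh_view"
    INTERNAL_REPLICATION = "internal_replication"
    COMPACTION = "compaction"


def is_allowed(op, db_dir_low, view_dir_low):
    """Return True if `op` may proceed given which directories are low on space."""
    if op is Op.CLUSTERED_UPDATE:
        # Assumption: only database_dir pressure blocks clustered updates.
        return not db_dir_low
    if op in (Op.START_INDEX_JOB, Op.QUERY_STALE_VIEW):
        # Assumption: only view_index_dir pressure blocks index building.
        return not view_dir_low
    # Up-to-date view queries, internal replication and compaction always run.
    return True


def http_status(op, db_dir_low, view_dir_low):
    """Map a client request to 200 or 507 (not meaningful for background jobs)."""
    return 200 if is_allowed(op, db_dir_low, view_dir_low) else INSUFFICIENT_STORAGE
```

For example, `http_status(Op.QUERY_STALE_VIEW, False, True)` gives 507, while 
`Op.COMPACTION` is allowed whichever directory is low.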

The goal is to leave internal replication running (to avoid data loss), along 
with compaction (the only action that reduces disk occupancy). I can see adding 
an option to suspend _all_ writing at, say, 99% full, in order to avoid hitting 
actual end of disk, but I have not coded this up in the branch so far.

At the moment these all activate at once, which I don't think is how we want 
this to work.

I suggest that we have configuration options for:

1) a global toggle to activate the out of disk handler
2) a parameter for the used disk percentage of view_index_dir at which we 
suspend background indexing, defaulting to 80
3) a parameter for the used disk percentage of view_index_dir at which we 
refuse to update stale indexes, defaulting to 90
4) a parameter for the used disk percentage of database_dir at which we suspend 
writes, defaulting to 90.
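
As a sketch of how the four options might combine, here is some Python (again 
not the Erlang in the branch). The field names are hypothetical, the toggle has 
no default because none is proposed above, and treating a threshold as reached 
at "greater than or equal" is my assumption. The 80/90/90 defaults are the ones 
proposed above.

```python
import shutil
from dataclasses import dataclass


@dataclass
class OutOfDiskConfig:
    # Hypothetical option names; only the numeric defaults come from the proposal.
    enabled: bool                              # 1) global toggle
    view_index_dir_background_pct: int = 80    # 2) suspend background indexing
    view_index_dir_interactive_pct: int = 90   # 3) refuse to update stale indexes
    database_dir_pct: int = 90                 # 4) suspend writes


def used_percent(path):
    """Percentage of the device holding `path` that is currently in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def evaluate(cfg, db_used_pct, view_used_pct):
    """Return which protections are active for the given usage percentages."""
    if not cfg.enabled:
        return {
            "suspend_background_indexing": False,
            "refuse_stale_views": False,
            "suspend_writes": False,
        }
    # Assumption: a threshold counts as reached when usage is >= its value.
    return {
        "suspend_background_indexing": view_used_pct >= cfg.view_index_dir_background_pct,
        "refuse_stale_views": view_used_pct >= cfg.view_index_dir_interactive_pct,
        "suspend_writes": db_used_pct >= cfg.database_dir_pct,
    }
```

With the defaults, a view_index_dir at 85% would suspend background indexing 
but still let stale views be updated on query, so the protections would kick in 
one at a time rather than all at once.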

What do we all think?


B.
