errose28 commented on PR #8405: URL: https://github.com/apache/ozone/pull/8405#issuecomment-2863715898
@sodonnel based on your comments I have another proposal to handle this issue. I can write that up in this doc as well so we can compare.

The current proposal mixes a degraded volume state with a sort of volume decommissioning feature. The latter is where most of the complexity comes from. As an initial change, we can make the degraded state purely an alert that shows up via metrics, the CLI, Recon, etc. when a volume is experiencing numerous IO errors but is still reachable. The state does not need to be persisted in this case.

At a later time, we can add volume decommissioning as a separate feature, which would handle persistence of the decommission state, space calculation, moving data, and the related work, similar to full datanode decommissioning. We could optionally add a config to have the system automatically decommission degraded volumes. However, in this proposal volume decommissioning would be left as a future improvement, and the current scope of work would just be about flagging a degraded state for volumes.
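To make the alert-only idea concrete, here is a rough sketch of a degraded flag that lives purely in memory and is derived from recent IO errors. Everything in it is an assumption for illustration: the class name `VolumeHealthSketch`, the `State` enum, the sliding-window approach, and the threshold and window values are hypothetical and are not taken from the Ozone code base or from the design doc, which does not define what counts as "numerous" IO errors.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch only: tracks recent IO errors for one volume and
 * reports DEGRADED while the count within a time window reaches a threshold.
 * Nothing is persisted, so the state resets when the process restarts.
 */
public class VolumeHealthSketch {
  enum State { HEALTHY, DEGRADED }

  private final int errorThreshold;   // assumed tuning knob
  private final long windowMillis;    // assumed tuning knob
  private final Deque<Long> errorTimes = new ArrayDeque<>();

  VolumeHealthSketch(int errorThreshold, long windowMillis) {
    this.errorThreshold = errorThreshold;
    this.windowMillis = windowMillis;
  }

  /** Record one IO error observed at the given time. */
  synchronized void recordIoError(long nowMillis) {
    errorTimes.addLast(nowMillis);
    evictOld(nowMillis);
  }

  /** Current state, suitable for exposing through a metric or CLI output. */
  synchronized State getState(long nowMillis) {
    evictOld(nowMillis);
    return errorTimes.size() >= errorThreshold ? State.DEGRADED : State.HEALTHY;
  }

  /** Drop errors that are older than the window. */
  private void evictOld(long nowMillis) {
    while (!errorTimes.isEmpty()
        && nowMillis - errorTimes.peekFirst() > windowMillis) {
      errorTimes.removeFirst();
    }
  }

  public static void main(String[] args) {
    VolumeHealthSketch volume = new VolumeHealthSketch(3, 60_000L);
    volume.recordIoError(1_000L);
    volume.recordIoError(2_000L);
    System.out.println(volume.getState(2_000L));   // HEALTHY
    volume.recordIoError(3_000L);
    System.out.println(volume.getState(3_000L));   // DEGRADED
    System.out.println(volume.getState(120_000L)); // HEALTHY
  }
}
```

Because the flag is recomputed from in-memory error timestamps, it clears on its own once errors stop, which fits the point that the degraded state does not need to be persisted. Any decommissioning behavior would sit outside a class like this.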