Hi BookKeeper Community,

I’d like to propose a change to how garbage collection (GC) handles disk-full scenarios. Currently, when any ledger disk becomes full, suspendMajorGC()/suspendMinorGC() pauses GC for all disks. This can unnecessarily penalize healthy disks, especially when disk utilization is uneven.

Consider two scenarios:

1. Even Data Distribution: All disks are nearly full, and one fills up first. Temporarily suspending GC on only the full disk (before the suspension propagates to the others) is safe.
2. Uneven Data Distribution: Due to write skew or cleanup inconsistencies, a single disk may fill up while the others still have free space. Halting GC globally penalizes disks that are still operational.

To address this, I see three options:

Option 1: Reuse isReadOnlyModeOnAnyDiskFullEnabled.
If isReadOnlyModeOnAnyDiskFullEnabled is true, stop GC on all disks; otherwise, suspend GC only on the full disk and let the other disks continue GC as normal.
Reason: isReadOnlyModeOnAnyDiskFullEnabled reflects the user’s intent about whether to stop all bookie writes when any single disk is full, and GC is itself a writer, since it may need to create new files and write data before it can clean up.

Option 2: When a single disk becomes full, stop GC only for that disk.
The other disks continue their GC uninterrupted.
Reason: This should be treated as a bug fix rather than a breaking change. No new configuration is needed; we simply fix the current behavior.

Option 3: Add a new configuration option that controls whether GC stops on the other disks when any single disk becomes full.
Reason: This leaves the existing default behavior unchanged but lets users configure it according to their needs.

I think Option 2 is the most appropriate, as it directly addresses the problem without introducing additional configuration complexity.

Looking forward to your feedback.

BR,
Xiangying
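
P.S. To make Option 2 concrete, here is a minimal sketch in Java. It is a toy model under stated assumptions, not the actual BookKeeper code: PerDiskGcState, PerDiskGcCoordinator, onDiskFull, onDiskWritable and isGcSuspended are hypothetical names used only for illustration, and the resume methods are assumed counterparts of suspendMajorGC()/suspendMinorGC(). The point is only that a disk-full event touches the GC state of the disk that reported it and nothing else.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model (not BookKeeper code): one GC state per ledger directory.
final class PerDiskGcState {
    private volatile boolean majorSuspended;
    private volatile boolean minorSuspended;

    void suspendMajorGC() { majorSuspended = true; }
    void suspendMinorGC() { minorSuspended = true; }
    // Assumed counterparts of the suspend methods.
    void resumeMajorGC()  { majorSuspended = false; }
    void resumeMinorGC()  { minorSuspended = false; }

    boolean isSuspended() { return majorSuspended || minorSuspended; }
}

// Hypothetical coordinator that reacts to disk-full / disk-writable events.
final class PerDiskGcCoordinator {
    private final Map<String, PerDiskGcState> gcByDisk = new ConcurrentHashMap<>();

    PerDiskGcCoordinator(String... ledgerDirs) {
        for (String dir : ledgerDirs) {
            gcByDisk.put(dir, new PerDiskGcState());
        }
    }

    // Option 2: suspend GC only for the disk that reported full.
    void onDiskFull(String disk) {
        PerDiskGcState gc = gcByDisk.get(disk);
        if (gc != null) {
            gc.suspendMajorGC();
            gc.suspendMinorGC();
        }
    }

    // Resume GC for that disk once it has free space again.
    void onDiskWritable(String disk) {
        PerDiskGcState gc = gcByDisk.get(disk);
        if (gc != null) {
            gc.resumeMajorGC();
            gc.resumeMinorGC();
        }
    }

    // Unknown disks are reported as not suspended.
    boolean isGcSuspended(String disk) {
        PerDiskGcState gc = gcByDisk.get(disk);
        return gc != null && gc.isSuspended();
    }
}
```

With two hypothetical ledger directories, say /data1 and /data2, calling onDiskFull("/data1") suspends GC for /data1 only, and /data2 keeps collecting. Option 1 or Option 3 would amount to one extra branch in onDiskFull that loops over every entry when the relevant flag is set.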