The engineers from DiDi confirmed that there is the same number of RaftGroup directories on the Datanode disk. It would be nice if the Datanode could recognize that these RaftGroups are no longer valid and clean them up.

On Tue, 9 Jul 2024 at 02:27, Duong Nguyen <du...@cloudera.com.invalid> wrote:

> Hi folks,
>
> "- Reported that there are hundreds of RaftServer objects on Datanodes.
> Need further investigation."
> I saw this once but didn't get to the bottom. There are likely hundreds of
> uncleaned RaftGroup directories under the raft log location. I guess
> there's a potential problem with cleaning RaftGroup directories when
> closing.
>
> Thanks,
> Duong
>
> On Fri, Jul 5, 2024 at 12:32 AM Sammi Chen <sammic...@apache.org> wrote:
>
> > Attenders: Hao, Mingyu, JIanghua, Xi, Sammi
> >
> > Jianghua:
> > - https://github.com/apache/ozone/pull/6886 reuse RunningDatanodeState in
> > datanode
> > - Reported that there are hundreds of RaftServer objects on Datanodes. Need
> > further investigation.
> >
> > Hao/Mingyu:
> > - Discussed one optimization of container balancer target datanode
> > selection criteria. If the policy is
> > - Observed high tail latency(>1s) of getBlock request due to LOCK in
> > ChunkUtils#processFileExclusively. ReentrantReadWriteLock like alternative
> > solution is discussed.
> >
> > Xi
> > - Since Ratis 3.1.0 is released. Will try to cherry-pick all existing
> > commits which targets for 1.4.1 to Ozone 1.4.1 release branch in the
> > following weeks.
> > - Discussed the idea to support S3 storage class feature.
> >
>