Hi team. The current scheduling mechanism for garbage collection uses
scheduleAtFixedRate. This approach schedules each execution against a fixed
clock without considering how long the current task takes. If a task runs
longer than the period, the delayed executions accumulate and then run
back-to-back with no pause between them.

In my test environment, after tasks accumulate in the GC thread pool, there
is sometimes no entry log to extract and no entry log to compact. But every
round of GC still has to compare ledger metadata between local state and the
metadata store (zk), which results in very frequent access to the metadata
store, and each access brings considerable unnecessary data traffic.

I propose changing this to use scheduleWithFixedDelay instead, which
ensures that there is a fixed delay period between the end of one execution
and the start of the next.
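
To illustrate the difference, here is a minimal standalone sketch. It is not
the actual garbage collection code: the class GcSchedulingSketch, the method
measureGaps, and the timing values are hypothetical names and numbers chosen
for the demo. It simulates one slow GC round and records the idle gap between
the end of each run and the start of the next under both scheduling modes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of the project.
class GcSchedulingSketch {

    // Runs a simulated GC task `runs` times (only the first run is slow) and
    // returns the idle gap in milliseconds between the end of each run and the
    // start of the next one.
    static List<Long> measureGaps(boolean fixedDelay, long periodMs, long slowRunMs, int runs) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        // Only the single worker thread touches these until the latch opens.
        List<Long> gaps = new ArrayList<>();
        long[] lastEndNanos = {-1L};
        int[] started = {0};
        CountDownLatch done = new CountDownLatch(runs);

        Runnable gcTask = () -> {
            if (started[0] >= runs) {
                return; // ignore any extra run that fires before shutdown
            }
            long start = System.nanoTime();
            if (lastEndNanos[0] >= 0) {
                gaps.add(TimeUnit.NANOSECONDS.toMillis(start - lastEndNanos[0]));
            }
            if (started[0]++ == 0) {
                try {
                    Thread.sleep(slowRunMs); // simulate a GC round that overruns the period
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
            lastEndNanos[0] = System.nanoTime();
            done.countDown();
        };

        try {
            if (fixedDelay) {
                // Proposed: the delay is counted from the end of the previous run.
                executor.scheduleWithFixedDelay(gcTask, 0, periodMs, TimeUnit.MILLISECONDS);
            } else {
                // Current: runs are due at 0, period, 2 * period, ... regardless of duration.
                executor.scheduleAtFixedRate(gcTask, 0, periodMs, TimeUnit.MILLISECONDS);
            }
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
        return gaps;
    }

    public static void main(String[] args) {
        System.out.println("fixed rate gaps (ms):  " + measureGaps(false, 100, 500, 4));
        System.out.println("fixed delay gaps (ms): " + measureGaps(true, 100, 500, 4));
    }
}
```

With a 100 ms period and a 500 ms first run, I would expect the fixed-rate
gaps to be close to zero (the overdue runs fire immediately one after another),
while every fixed-delay gap should be at least about 100 ms. Exact numbers will
vary by machine.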
Thanks
ZhangJian He
