Hi guys,
   When we use TransactionalEntryLogCompactor to compact entry log
files during garbage collection, it generates a lot of small entry log
files, and for those files the file usage is usually greater than 90%,
so they cannot be compacted unless the usage decreases. As time goes
on, the number of small entry log files keeps growing, which puts
heavy pressure on heap memory (to store entryLogMetaMap) and on
RocksDB (to store the index).

   If we switch from TransactionalEntryLogCompactor to
EntryLogCompactor, no new small entry log files are generated, but the
existing small entry log files still can't be compacted.

   The root cause of these uncompactable small entry log files is that
we only check file usage before compaction: if the file usage is below
the threshold, the entry log file is compacted; otherwise, it is
skipped during garbage collection.
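To make the problem concrete, here is a minimal sketch of the usage-only decision described above (the method and parameter names are illustrative, not the actual BookKeeper code):

```java
// Sketch of the current compaction decision: only file usage is
// considered, so a small but high-usage log is never compacted.
public class UsageOnlyCheck {
    // fileUsage: fraction of the log that is still live data.
    // usageThreshold: e.g. the configured compaction threshold.
    static boolean shouldCompact(double fileUsage, double usageThreshold) {
        return fileUsage < usageThreshold;
    }

    public static void main(String[] args) {
        // A tiny log that is 95% live data is always skipped,
        // no matter how small the file is.
        assert !shouldCompact(0.95, 0.5);
        // A log with low usage is compacted as expected.
        assert shouldCompact(0.30, 0.5);
    }
}
```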

   I want to introduce an entry log file size check: if an entry log
file's total size is below a configurable percentage of the max entry
log file size, we compact it regardless of whether its file usage is
above the threshold. The percentage can be configured in
conf/bk_server.conf and defaults to 0.0, which disables the file size
check.
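The proposed combined condition could look roughly like the sketch below. The names (sizeCheckPercentage, maxLogSize) are illustrative assumptions, not necessarily what the PR uses:

```java
// Sketch of the proposed decision: compact when usage is low
// (existing rule) OR when the file is small enough (new rule).
public class ProposedCheck {
    static boolean shouldCompact(long fileSize, long maxLogSize,
                                 double fileUsage, double usageThreshold,
                                 double sizeCheckPercentage) {
        // Existing rule: low usage -> compact.
        if (fileUsage < usageThreshold) {
            return true;
        }
        // New rule: a small file is compacted even at high usage.
        // sizeCheckPercentage == 0.0 disables the size check,
        // preserving today's behavior.
        return sizeCheckPercentage > 0.0
                && fileSize < maxLogSize * sizeCheckPercentage;
    }

    public static void main(String[] args) {
        long maxLogSize = 1L << 30; // assume a 1 GiB max entry log size
        // A 10 MiB file at 95% usage: skipped today, but compacted
        // once the size threshold is set to 10%.
        assert shouldCompact(10L << 20, maxLogSize, 0.95, 0.5, 0.1);
        // With the size check disabled (0.0), it stays skipped.
        assert !shouldCompact(10L << 20, maxLogSize, 0.95, 0.5, 0.0);
    }
}
```

With this in place, small high-usage files left behind by TransactionalEntryLogCompactor would eventually be merged away instead of accumulating.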

   The PR: https://github.com/apache/bookkeeper/pull/3631

   Do you have any ideas?

   Thanks,
   Hang
