In a typical BookKeeper deployment, SSDs store journal log data, while HDDs
store ledger data. Writes are first staged in an in-memory cache and then
asynchronously flushed to the HDD in the background. Because memory is
limited, only a small amount of data can be cached, so reads of historical
data ultimately fall through to the HDD, which becomes a bottleneck for the
entire BookKeeper cluster. Moreover, during data recovery after a node
failure, a substantial amount of historical data must be read from the HDD,
driving the disk's I/O utilization to saturation and causing significant
read delays or failures.

To address these challenges, a new architecture is proposed: a disk cache
is introduced between the memory cache and the HDD, using an SSD as the
intermediate medium to significantly extend how long data remains cached.
The data flow is: journal -> write cache -> SSD cache -> HDD disk. The SSD
disk cache functions as a regular LedgerStorage layer and is compatible
with all existing LedgerStorage implementations. The process is as follows:

   1. Data eviction from the disk cache to the Ledger data disk occurs on a
   per-log file basis.
   2. A new configuration parameter, diskCacheRetentionTime, is added to
   set the duration for which hot data is retained. Files with write
   timestamps older than the retention time will be evicted to the Ledger data
   disk.
   3. A new configuration parameter, diskCacheThreshold, is added. If disk
   cache utilization exceeds this threshold, eviction is accelerated: data is
   evicted to the Ledger data disk in file write order until utilization
   falls back below the threshold.
   4. A new thread, ColdStorageArchiveThread, is introduced to periodically
   evict data from the disk cache to the Ledger data disk.
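The two eviction conditions above can be sketched as follows. This is an
illustrative sketch, not BookKeeper's actual API: the class and method
names are hypothetical, while the parameter names (diskCacheRetentionTime,
diskCacheThreshold) and the either-condition-triggers-archival behavior
follow the proposal.

```java
import java.time.Duration;

/**
 * Hypothetical sketch of the disk-cache eviction decision that a
 * ColdStorageArchiveThread would evaluate periodically, walking cache
 * files in write order.
 */
public class DiskCacheEvictionSketch {

    // Condition 2: the file's last write is older than diskCacheRetentionTime.
    static boolean isExpired(long lastWriteMillis, long nowMillis, long retentionMillis) {
        return nowMillis - lastWriteMillis > retentionMillis;
    }

    // Condition 3: disk-cache utilization exceeds diskCacheThreshold.
    static boolean overThreshold(long usedBytes, long capacityBytes, double threshold) {
        return (double) usedBytes / capacityBytes > threshold;
    }

    // A file is archived to the Ledger (HDD) disk if either condition holds.
    static boolean shouldArchive(long lastWriteMillis, long nowMillis, long retentionMillis,
                                 long usedBytes, long capacityBytes, double threshold) {
        return isExpired(lastWriteMillis, nowMillis, retentionMillis)
                || overThreshold(usedBytes, capacityBytes, threshold);
    }

    public static void main(String[] args) {
        long now = 10_000_000L;
        long retention = Duration.ofHours(1).toMillis();
        // Fresh file, cache under threshold: stays in the SSD cache.
        System.out.println(shouldArchive(now - 1_000, now, retention, 40, 100, 0.85));
        // Old file: evicted on retention time alone.
        System.out.println(shouldArchive(now - retention - 1, now, retention, 40, 100, 0.85));
        // Cache over threshold: evicted even though the file is fresh.
        System.out.println(shouldArchive(now - 1_000, now, retention, 90, 100, 0.85));
    }
}
```

Evicting whole log files (step 1) keeps this check cheap: the thread only
compares per-file timestamps and overall utilization rather than tracking
individual entries.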
