Forgot one thing ... I'd like to share a design document regarding the Row Cache for HBase 2.x/3.x for (potential) further discussion:
https://docs.google.com/document/d/1Ag0cej2X0qNBb2HMVOJGiQdDRntqDW5n/edit?usp=sharing&ouid=107898024699489289958&rtpof=true&sd=true

Best regards,
Vladimir Rodionov

On Mon, Mar 16, 2026 at 11:37 AM Vladimir Rodionov <[email protected]> wrote:
>
> Hi Xiao,
>
> Thank you very much for the detailed response and for sharing your
> experience. It is very interesting to hear that you implemented a
> RowCache-like mechanism internally and have been running it in
> production at large scale. CPU reduction under the same traffic is
> exactly the type of benefit I was hoping to see from logical row
> caching.
>
> From an architectural perspective, the RowCache design takes a
> somewhat different approach from HBASE-29585. The implementation is a
> coprocessor and therefore requires no changes to the HBase core code
> path, which allows it to be deployed and evaluated independently. It
> also uses a cache engine (which can easily be made pluggable) designed
> to store a very large number of small objects with minimal metadata
> overhead.
>
> One aspect I found important while experimenting with row-level
> caching is the scalability of the cache storage layer when the cache
> contains a very large number of small entries. In many HBase workloads
> rows are relatively small, which means the cache may contain millions
> or even hundreds of millions of objects. In such scenarios the
> per-entry metadata overhead becomes an important factor. For this
> reason the RowCache implementation uses a cache engine optimized for
> compact storage of large numbers of small objects (it even builds
> shared compression dictionaries and applies them to objects during
> compression, reducing the memory footprint even further). This was one
> of the motivations for not reusing BucketCache directly, since its
> metadata overhead was originally designed around block-sized entries
> rather than row-sized objects.
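[Editor's note: to make the per-entry metadata argument concrete, here is a back-of-envelope sketch. The per-entry byte counts are illustrative assumptions, not measured numbers from BucketCache or Carrot Cache, and the class is invented for this example.]

```java
// Back-of-envelope: RAM consumed by per-entry cache metadata at scale.
// The byte counts are assumed for illustration, not measured values.
public class MetadataOverhead {
    public static long overheadBytes(long entries, long bytesPerEntry) {
        return entries * bytesPerEntry;
    }

    public static void main(String[] args) {
        long entries = 200_000_000L; // 200M cached rows
        long blockStyle = 100;       // assumed metadata/entry, block-oriented design
        long compactStyle = 16;      // assumed metadata/entry, compact row-cache design
        System.out.printf("block-style:   %.1f GB%n",
                overheadBytes(entries, blockStyle) / 1e9);   // 20.0 GB
        System.out.printf("compact-style: %.1f GB%n",
                overheadBytes(entries, compactStyle) / 1e9); // 3.2 GB
    }
}
```

At block granularity (thousands of entries per GB of data) the same metadata cost is negligible; at row granularity it can dominate, which is the scalability concern raised above.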
> It will be interesting to see how different approaches behave as
> cache entry counts grow to very large scales.
>
> My understanding is that HBASE-29585 reuses BucketCache as the
> storage layer for row objects. While that approach has the advantage
> of integrating with existing infrastructure, it may face scalability
> challenges when the cache stores extremely large numbers of small
> entries, since BucketCache metadata overhead becomes more significant
> at that granularity. It will be interesting to see how this behaves
> in large production deployments.
>
> In any case, it is very encouraging to see that similar ideas are
> being explored and successfully used in production systems such as
> yours, OceanBase, and XEngine. Logical row caching appears to be a
> useful complement to the existing HBase caching layers.
>
> Thank you again for sharing your experience, and I look forward to
> continuing the discussion.
>
> Best regards,
> Vladimir
>
> On Sun, Mar 15, 2026 at 8:37 PM Xiao Liu <[email protected]> wrote:
> >
> > Thanks, Vladimir!
> >
> > In fact, by 2025 some of our production use cases, such as features
> > and tags, used bulkload for offline data loading and multi-get for
> > real-time point queries to serve external users. In this scenario
> > we found that cache utilization was not very efficient.
> >
> > We have also researched similar systems. As you mentioned, many of
> > them have implemented a RowCache. Other examples include
> > OceanBase[1] and XEngine[2], the latter used by PolarDB.
> >
> > We implemented a RowCache based on these references and deployed it
> > in our production environment. In our benchmark tests, Get
> > throughput improved under the same resource conditions. In actual
> > production use, CPU utilization dropped significantly under the
> > same traffic load.
> > After running in production for nearly six months and handling
> > tens of millions of requests per second, it has proven to be a
> > viable solution and a valuable complement to HBase use cases.
> >
> > During practical implementation, we encountered several challenges,
> > including:
> > 1. Ensuring data consistency in bulk load scenarios
> > 2. Maintaining cache consistency during automatic region balancing
> > 3. Performance under high filterRead loads
> > 4. Business scenarios where Get.addColumn is used to read data from
> >    a specific cell rather than an entire row
> > 5. Handling massive table data volumes with a 7-day expiration
> >    policy
> > ...
> >
> > In summary, thank you very much for proposing and open-sourcing a
> > solution; we will study it in depth. At the same time, as Duo
> > mentioned, we are very pleased to see that HBASE-29585 in the
> > community is also being actively advanced, and we can work together
> > to drive the implementation of this feature in HBase.
> >
> > Best,
> > Xiao Liu
> >
> > [1] OceanBase:
> > https://www.oceanbase.com/docs/common-oceanbase-database-cn-10000000001576547
> > [2] XEngine:
> > https://www.alibabacloud.com/help/en/polardb/polardb-for-mysql/user-guide/x-engine-principle-analysis
> >
> > On 2026/01/04 21:02:28 Vladimir Rodionov wrote:
> > > Hello HBase community,
> > >
> > > I’d like to start a discussion around a feature that exists in
> > > related systems but is still missing in Apache HBase: row-level
> > > caching.
> > >
> > > Both *Cassandra* and *Google Bigtable* provide a row cache for
> > > hot rows. Bigtable recently revisited this area and reported
> > > measurable gains for single-row reads. HBase today relies almost
> > > entirely on *block cache*, which is excellent for scans and
> > > predictable access patterns, but can be inefficient for *small
> > > random reads*, *hot rows spanning multiple blocks*, and *cloud /
> > > object-store–backed deployments*.
> > > To explore this gap, I’ve been working on an *HBase Row Cache
> > > for HBase 2.x*, implemented as a *RegionObserver coprocessor*,
> > > and I’d appreciate feedback from HBase developers and operators.
> > >
> > > *Project*:
> > > https://github.com/VladRodionov/hbase-row-cache
> > >
> > > *Background / motivation (cloud focus):*
> > > https://github.com/VladRodionov/hbase-row-cache/wiki/HBase:-Why-Block-Cache-Alone-Is-No-Longer-Enough-in-the-Cloud
> > >
> > > What This Is
> > >
> > > - Row-level cache for HBase 2.x (coprocessor-based)
> > > - Powered by *Carrot Cache* (mostly off-heap, GC-friendly)
> > > - Multi-level cache (L1/L2/L3)
> > > - Read-through caching of table : rowkey : column-family
> > > - Cache invalidation on any mutation of the corresponding row+CF
> > > - Designed for *read-mostly, random-access* workloads
> > > - Can be enabled per table or per column family
> > > - Typically used *instead of*, not alongside, block cache
> > >
> > > *Block Cache vs Row Cache (Conceptual)*
> > >
> > > Aspect                            | Block Cache                 | Row Cache
> > > ----------------------------------|-----------------------------|-----------------------------------
> > > Cached unit                       | HFile block (e.g. 64KB)     | Row / column family
> > > Optimized for                     | Scans, sequential access    | Random small reads, hot rows
> > > Memory efficiency for small reads | Low (unused data in blocks) | High (cache only requested data)
> > > Rows spanning multiple blocks     | Multiple blocks cached      | Single cache entry
> > > Read-path CPU cost                | Decode & merge every read   | Amortized across hits
> > > Cloud / object store fit          | Necessary but expensive     | Reduces memory & I/O amplification
> > >
> > > Block cache remains essential; row cache targets a *different
> > > optimization point*.
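[Editor's note: the read-through / invalidate-on-mutation behavior described above can be modeled in a few lines. This is a toy sketch only, not code from hbase-row-cache: the real implementation is a RegionObserver coprocessor backed by Carrot Cache, while here `loadRow` stands in for the underlying region read and all names are invented.]

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Toy model of read-through row caching keyed by "table:rowkey:cf",
// with invalidation on any mutation of the same row+CF.
public class RowCacheSketch {
    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    private static String key(String table, String row, String cf) {
        return table + ":" + row + ":" + cf;
    }

    // Read-through: serve from cache; on a miss, load and remember.
    public byte[] get(String table, String row, String cf,
                      Function<String, byte[]> loadRow) {
        return cache.computeIfAbsent(key(table, row, cf), loadRow);
    }

    // Any mutation of row+CF drops the cached entry, so the next
    // read goes back to the region and re-populates the cache.
    public void onMutation(String table, String row, String cf) {
        cache.remove(key(table, row, cf));
    }
}
```

In the coprocessor setting, the cache lookup would sit in the Get path and the invalidation in the Put/Delete path, which is why no HBase core change is required.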
> > > *Non-Goals (Important)*
> > >
> > > - Not proposing removal or replacement of block cache
> > > - Not suggesting this be merged into HBase core
> > > - Not targeting scan-heavy or sequential workloads
> > > - Not eliminating row reconstruction entirely
> > > - Not optimized for write-heavy or highly mutable tables
> > > - Not changing HBase storage or replication semantics
> > >
> > > This is an *optional optimization* for a specific class of
> > > workloads.
> > >
> > > *Why I’m Posting*
> > >
> > > This is *not a merge proposal*, but a request for discussion:
> > >
> > > 1. Do you see *row-level caching* as relevant for modern HBase
> > >    deployments?
> > > 2. Are there workloads where block cache alone is insufficient
> > >    today?
> > > 3. Is a coprocessor-based approach reasonable for
> > >    experimentation?
> > > 4. Are there historical or architectural reasons why a row cache
> > >    never landed in HBase?
> > >
> > > Any feedback, positive or critical, is very welcome.
> > >
> > > Best regards,
> > >
> > > Vladimir Rodionov
> > >
