Thanks, Vladimir!

In fact, as of 2025, some of our production use cases, such as features and 
tags, use bulk load for offline data loading and multi-get for real-time 
point queries serving external users. In this scenario, we have found that 
cache utilization is not very efficient.

We have also researched similar systems. As you mentioned, many of them have 
implemented a RowCache. Other examples include OceanBase[2] and XEngine[1], 
which is used by PolarDB.

We have implemented a RowCache based on these references and deployed it in our 
production environment. In our benchmark tests, Get throughput improved under 
the same resource conditions. In actual production use, CPU utilization dropped 
significantly under the same traffic load. After running in production for 
nearly six months and handling tens of millions of requests per second, it has 
proven to be a viable solution and a valuable complement to HBase use cases.

During practical implementation, we encountered several challenges, including:
1. Ensuring data consistency in bulk load scenarios
2. Maintaining cache consistency during automatic region balancing
3. Performance under high filterRead loads
4. Business scenarios where Get.addColumn is used to read data from a specific 
cell rather than an entire row
5. Handling massive table data volumes with a 7-day expiration policy
...
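To illustrate challenge 4 above, here is a minimal, hypothetical sketch (class and method names are ours for illustration, not from the hbase-row-cache project or our internal implementation) of how a cache at row+CF granularity can still serve Get.addColumn-style single-cell reads: entries are keyed on table:rowkey:columnFamily, and a point read filters the cached row entry by qualifier rather than requiring a separate per-cell cache entry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Illustrative sketch only: a read-through row cache keyed by
// "table:rowkey:cf", where each entry holds all qualifier -> value
// cells for that row+CF. A single-column read (the Get.addColumn
// case) is served by filtering the cached row entry; any mutation
// to the row+CF invalidates the one entry.
public class RowCacheSketch {
    // key = "table:rowkey:cf", value = qualifier -> cell value
    private final Map<String, Map<String, byte[]>> cache = new ConcurrentHashMap<>();

    private static String key(String table, String row, String cf) {
        return table + ":" + row + ":" + cf;
    }

    // Read-through: populate on miss (the loader stands in for a store read).
    public Map<String, byte[]> getRow(String table, String row, String cf,
                                      Supplier<Map<String, byte[]>> loader) {
        return cache.computeIfAbsent(key(table, row, cf), k -> loader.get());
    }

    // Serve a single-cell read from the full row+CF entry.
    public byte[] getColumn(String table, String row, String cf, String qualifier,
                            Supplier<Map<String, byte[]>> loader) {
        return getRow(table, row, cf, loader).get(qualifier);
    }

    // Any mutation of the row+CF drops the whole entry.
    public void invalidate(String table, String row, String cf) {
        cache.remove(key(table, row, cf));
    }
}
```

The trade-off is that a point read may pull an entire wide row+CF into cache; in practice an implementation would likely cap entry sizes or bypass the cache for very wide rows.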

In summary, thank you very much for proposing and open-sourcing this solution; 
we will study it in depth. As Duo mentioned, we are also very pleased to see 
that HBASE-29585 is being actively advanced in the community, and we can work 
together to drive the implementation of this feature in HBase.

Best,
Xiao Liu

[1] XEngine: 
https://www.alibabacloud.com/help/en/polardb/polardb-for-mysql/user-guide/x-engine-principle-analysis
[2] OceanBase: 
https://www.oceanbase.com/docs/common-oceanbase-database-cn-10000000001576547



On 2026/01/04 21:02:28 Vladimir Rodionov wrote:
> Hello HBase community,
> 
> I’d like to start a discussion around a feature that exists in related
> systems but is still missing in Apache HBase: row-level caching.
> 
> Both *Cassandra* and *Google Bigtable* provide a row cache for hot rows.
> Bigtable recently revisited this area and reported measurable gains for
> single-row reads. HBase today relies almost entirely on *block cache*,
> which is excellent for scans and predictable access patterns, but can be
> inefficient for *small random reads*, *hot rows spanning multiple blocks*,
> and *cloud / object-store–backed deployments*.
> 
> To explore this gap, I’ve been working on an *HBase Row Cache for HBase 2.x*,
> implemented as a *RegionObserver coprocessor*, and I’d appreciate feedback
> from HBase developers and operators.
> 
> *Project*:
> 
> https://github.com/VladRodionov/hbase-row-cache
> 
> 
> *Background / motivation (cloud focus):*
> 
> https://github.com/VladRodionov/hbase-row-cache/wiki/HBase:-Why-Block-Cache-Alone-Is-No-Longer-Enough-in-the-Cloud
> 
> What This Is
> 
> 
>    - Row-level cache for HBase 2.x (coprocessor-based)
>    - Powered by *Carrot Cache* (mostly off-heap, GC-friendly)
>    - Multi-level cache (L1/L2/L3)
>    - Read-through caching of table : rowkey : column-family
>    - Cache invalidation on any mutation of the corresponding row+CF
>    - Designed for *read-mostly, random-access* workloads
>    - Can be enabled per table or per column family
>    - Typically used *instead of*, not alongside, block cache
> 
> *Block Cache vs Row Cache (Conceptual)*
> 
> Aspect                            | Block Cache                  | Row Cache
> ----------------------------------|------------------------------|----------------------------------
> Cached unit                       | HFile block (e.g. 64KB)      | Row / column family
> Optimized for                     | Scans, sequential access     | Random small reads, hot rows
> Memory efficiency for small reads | Low (unused data in blocks)  | High (cache only requested data)
> Rows spanning multiple blocks     | Multiple blocks cached       | Single cache entry
> Read-path CPU cost                | Decode & merge every read    | Amortized across hits
> Cloud / object store fit          | Necessary but expensive      | Reduces memory & I/O amplification
> 
> Block cache remains essential; row cache targets a *different optimization
> point*.
> 
> *Non-Goals (Important)*
> 
> 
>    - Not proposing removal or replacement of block cache
>    - Not suggesting this be merged into HBase core
>    - Not targeting scan-heavy or sequential workloads
>    - Not eliminating row reconstruction entirely
>    - Not optimized for write-heavy or highly mutable tables
>    - Not changing HBase storage or replication semantics
> 
> This is an *optional optimization* for a specific class of workloads.
> 
> *Why I’m Posting*
> 
> This is *not a merge proposal*, but a request for discussion:
> 
> 
>    1. Do you see *row-level caching* as relevant for modern HBase
>    deployments?
>    2. Are there workloads where block cache alone is insufficient today?
>    3. Is a coprocessor-based approach reasonable for experimentation?
>    4. Are there historical or architectural reasons why row cache never
>    landed in HBase?
> 
> Any feedback—positive or critical—is very welcome.
> 
> Best regards,
> 
> Vladimir Rodionov
> 
