Good point! I think we can run the TPC-DS benchmark multiple times, until the
LLAP cache has loaded enough of the data onto the SSD, and then check whether
query performance improves across runs. If I remember correctly, LLAP has a
page where you can check the cache hit rate.

Thanks,
Butao Zhang

On 2025/09/08 09:12:59 Denys Kuzmenko wrote:
> Hi Sungwoo,
> 
> I don’t believe the TPC-DS benchmark is the best way to demonstrate the 
> advantages of Hive LLAP’s distributed cache. 
> TPC-DS is primarily designed to measure query optimization and overall system 
> performance across a wide variety of complex workloads, but it doesn’t 
> necessarily highlight scenarios where LLAP’s in-memory caching of frequently 
> accessed data provides clear benefits. 
> A more targeted benchmark or workload that emphasizes repeated access to the 
> same datasets would be a better fit to showcase the strengths of LLAP’s 
> distributed caching capabilities.
> 
> Regards,
> Denys
> 
