Hi,
We currently have a cache[1] in gandiva that caches the built projector or 
filter module with LRU based eviction policy. However since the cost of 
building different expressions is not uniform it makes sense to have a 
different eviction policy that takes into account an associated cost of a cache 
miss (while also discounting the items which have not been recently used). We 
are planning to use an algorithm called GreedyDual-Size Algorithm [2] which 
seems fit for the purpose. The algorithm is quite simple -
Each item has a cost (build time in our case) and item with lowest cost (c_min) 
is evicted. All other items cost are deducted by (c_min)
On cache hit, the item cost is restored to the original value

This can be implemented using a priority queue and an efficient implementation 
of this can handle both cache hit and eviction in O(logk) time.

Does anybody have any other suggestions or ideas on this?

[1] https://github.com/apache/arrow/blob/master/cpp/src/gandiva/cache.h 
<https://github.com/apache/arrow/blob/master/cpp/src/gandiva/cache.h>
[2] 
https://www.usenix.org/legacy/publications/library/proceedings/usits97/full_papers/cao/cao_html/node8.html
 
<https://www.usenix.org/legacy/publications/library/proceedings/usits97/full_papers/cao/cao_html/node8.html>

Regards,
Projjal

Reply via email to