> As we work on telecom data records (voice call/sms/GPRS xDRs), the data
> volume is simply HUGE, and we definitely need a “controlled” caching
> mechanism in front of the Cassandra layer.

What's huge? Number of gigs, ballpark.

> By the term “controlled cache layer”, what I am trying to suggest is
> something like maybe maintaining a list of most high-usage (and therefore,
> high occurrence) phone numbers somewhere, and the cache layer will hold all
> live data and counters for those numbers in memory. Therefore, all

The Cassandra row cache is LRU, and the OS page cache is "LRU-ish"
(though if you are unlucky you may see evictions at any time).

If you use an external cache, keep in mind that you immediately have
the problem that the cache can become inconsistent with the data in
Cassandra.

You may also want to wait for off-heap row cache support to land in a
released version, since it should be more efficient w.r.t. memory usage
and GC overhead than the normal row caching behavior.

But before asking what the appropriate external cache is, make sure you
actually need one, since the lack of guaranteed consistency with the
Cassandra cluster is usually something that is nice to avoid.

--
/ Peter Schuller (@scode on twitter)