[ https://issues.apache.org/jira/browse/CASSANDRA-5357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13864543#comment-13864543 ]
Jonathan Ellis commented on CASSANDRA-5357:
-------------------------------------------

bq. I think we could also do some intelligent sizing of the cache per-CF with the metrics we keep, that would be relatively static (so impervious to churn).

I'm not sure what I was thinking here. (Maybe that we'd only need one cached partition per CF which is nonsense.) We do need LRU or similar behavior at a high level, just like we do with the row cache today.

The question is, how much of each partition do we cache? I think it's a lot simpler if we decide we'll cache the same amount for each partition in a CF, and not try to be clever and "extend" a cached partition when we query for more later.

So how much do we cache? We can either
# Make the user configure it, which requires creating new CQL syntax, or
# Determine it automatically

Personally I'd lean towards (2):
# Track an EstimatedHistogram of LIMITs in qualifying queries
# Set the cells-to-cache per CF so that we maximize the queries we can satisfy for a given cache size
# I think this also means we should go back to a separate cache per CF with its own size limit -- if we have 1000 queries/s against CF X's cache, then we shouldn't throw those away when a query against CF Y comes in where we expect only 10/s

In the interest of shipping sooner than later though I'll take whatever we can reasonably do for 2.1.0 and push the rest out to improve later. If we just have a single "cache this many cells" parameter in cassandra.yaml that's still better than people OOMing themselves with the classic row cache.

> Query cache
> -----------
>
>                 Key: CASSANDRA-5357
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5357
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Marcus Eriksson
>             Fix For: 2.1
>
>
> I think that most people expect the row cache to act like a query cache,
> because that's a reasonable model. Caching the entire partition is, in
> retrospect, not really reasonable, so it's not surprising that it catches
> people off guard, especially given the confusion we've inflicted on ourselves
> as to what a "row" constitutes.
> I propose replacing it with a true query cache.
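
The automatic sizing idea in the comment (track a histogram of LIMITs, then pick the cells-to-cache per CF that satisfies the most queries for a given cache size) could be sketched roughly as follows. This is an illustrative Python sketch, not Cassandra code: the function name `choose_cells_to_cache`, the `hot_partitions` parameter, and the assumption that queries are spread evenly over that many partitions are all hypothetical, and a plain dict stands in for an EstimatedHistogram.

```python
def choose_cells_to_cache(limit_counts, cache_cells, hot_partitions):
    """Pick a per-partition cell count for one CF's cache.

    limit_counts:   dict mapping an observed LIMIT to how many queries used it
                    (a stand-in for an EstimatedHistogram).
    cache_cells:    total number of cells this CF's cache may hold.
    hot_partitions: ASSUMPTION: number of partitions that receive the
                    queries, each equally likely to be read.

    Returns (cells_per_partition, estimated_fraction_of_queries_served).
    """
    total = sum(limit_counts.values())
    if total == 0 or cache_cells <= 0 or hot_partitions <= 0:
        return 0, 0.0

    best_k, best_score = 0, 0.0
    covered = 0  # queries whose LIMIT fits within k cells
    for k in sorted(limit_counts):
        if k <= 0:
            continue  # ignore nonsensical LIMIT values
        if k > cache_cells:
            break  # a single partition would not even fit
        covered += limit_counts[k]
        # Caching k cells per partition leaves room for this many partitions.
        partitions_cached = cache_cells // k
        # Under the uniform-popularity assumption, chance the partition is cached.
        hit_prob = min(1.0, partitions_cached / hot_partitions)
        score = (covered / total) * hit_prob
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


# Made-up workload: most queries use LIMIT 10, a few use LIMIT 1000.
observed_limits = {10: 700, 100: 250, 1000: 50}
cells, served = choose_cells_to_cache(observed_limits, cache_cells=10_000,
                                      hot_partitions=200)
```

With this made-up workload and a 10,000-cell budget, the sketch settles on 10 cells per partition: caching more cells per partition would cover more LIMITs but leave room for too few partitions. Because the budget is an input for a single CF, the sketch also fits the comment's third point of keeping a separate cache, with its own size limit, per CF.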