Re: Effective cache size

Jonathan Ellis Thu, 03 Jun 2010 19:16:40 -0700

On Thu, Jun 3, 2010 at 10:17 AM, David King <dk...@ketralnis.com> wrote:
>>> So with the row cache, that first node (the primary replica) is the one 
>>> that has that row cached, yes?
>> No, it's the closest node as determined by snitch.sortByProximity.
>
> And with the default snitch, rack-unaware placement, random partitioner, and 
> all nodes up, that's the primary replica, right?


No.  When all replicas have equal weight it's basically random.
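
To make the effect on cache size concrete, here is a rough sketch rather than Cassandra code. It assumes "basically random" means a uniform random choice among the replicas, and it borrows the 18 million aggregate figure from this thread; the function names are made up for illustration.

```python
import random


def replicas_caching_hot_key(num_replicas=3, num_reads=1000, seed=0):
    """Count how many replicas end up caching one hot key under CL=ONE reads.

    Assumption: each read is served by a uniformly random replica, and the
    serving replica caches the row.
    """
    rng = random.Random(seed)
    caches = [set() for _ in range(num_replicas)]
    for _ in range(num_reads):
        serving = rng.randrange(num_replicas)  # stand-in for the snitch tie-break
        caches[serving].add("hot-key")
    return sum(1 for cache in caches if "hot-key" in cache)


def effective_distinct_rows(aggregate_capacity, copies_per_row):
    """Distinct rows the cluster can cache if each row is cached copies_per_row times."""
    return aggregate_capacity // copies_per_row


copies = replicas_caching_hot_key()
print(copies, effective_distinct_rows(18_000_000, copies))
```

Under these assumptions a hot row ends up cached on every replica, so the cluster-wide cache holds roughly a third as many distinct rows as it would if only one replica ever served each row.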

>> any given node X will never know whether another node Y has a row cached or 
>> not.  the overhead for communicating that level of detail would be totally 
>> prohibitive. all caching does is speed the read, once a request is received 
>> for data local to a given node.  no more, no less.
>
> Yes, that's my concern, but the details significantly affect the effective 
> size of the cache (in the aforementioned case, the details place the 
> effective size at either 6 million or 18 million, a 3x difference).
>
> So given CL==ONE reads, only the actually read node (which will be the 
> primary replica given the default placement strategy and snitch) will cache 
> the item, right? The checksum-checking doesn't cause the row to be cached on 
> the non-read nodes?

You have to read the data before you can checksum it.  So, on the
contrary, whether a read is a digest (checksum) read or a data read
has no effect on cache behavior.

> If I read with CL==QUORUM in an RF==3 environment, do both read nodes then 
> cache the item, or only the primary replica?

Both.  Which is what you want, otherwise your digest reads will cause
substantial unnecessary i/o on hot keys.
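
A toy model of that behavior, not the actual read path: the Replica class, its method names, and the use of MD5 as the digest are all assumptions made for illustration. The only point is that a digest read goes through the same load-and-cache step as a data read.

```python
import hashlib


class Replica:
    """Hypothetical replica with a row cache in front of a slow 'disk'."""

    def __init__(self, disk):
        self.disk = disk        # dict of key -> bytes, stands in for on-disk data
        self.row_cache = {}
        self.disk_reads = 0

    def _load(self, key):
        # Both read types come through here, so both populate the cache.
        if key not in self.row_cache:
            self.disk_reads += 1
            self.row_cache[key] = self.disk[key]
        return self.row_cache[key]

    def data_read(self, key):
        return self._load(key)

    def digest_read(self, key):
        # The row must be read before it can be hashed.
        return hashlib.md5(self._load(key)).hexdigest()


def quorum_read(replicas, key):
    """Contact a quorum: one data read plus digest reads from the rest."""
    quorum = len(replicas) // 2 + 1
    contacted = replicas[:quorum]
    data = contacted[0].data_read(key)
    digests = [r.digest_read(key) for r in contacted[1:]]
    consistent = all(d == hashlib.md5(data).hexdigest() for d in digests)
    return data, consistent


disk = {"hot-key": b"row contents"}
nodes = [Replica(disk) for _ in range(3)]   # RF == 3
for _ in range(5):
    quorum_read(nodes, "hot-key")
print([n.disk_reads for n in nodes])
```

In this sketch the two contacted replicas each touch disk once and serve the remaining reads from cache, while the third replica is never read. If the digest path skipped the cache, the second replica would hit disk on every one of the five reads.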

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
