On Thu, Jun 3, 2010 at 10:17 AM, David King <dk...@ketralnis.com> wrote:
>>> So with the row cache, that first node (the primary replica) is the one
>>> that has that row cached, yes?
>> No, it's the closest node as determined by snitch.sortByProximity.
>
> And with the default snitch, rack-unaware placement, random partitioner, and
> all nodes up, that's the primary replica, right?

No. When all replicas have equal weight it's basically random.

>> any given node X will never know whether another node Y has a row cached or
>> not. the overhead for communicating that level of detail would be totally
>> prohibitive. all caching does is speed the read, once a request is received
>> for data local to a given node. no more, no less.
>
> Yes, that's my concern, but the details significantly affect the effective
> size of the cache (in the aforementioned case, the details place the
> effective size at either 6 million or 18 million, a 3x difference).
>
> So given CL==ONE reads, only the actually read node (which will be the
> primary replica given the default placement strategy and snitch) will cache
> the item, right? The checksum-checking doesn't cause the row to be cached on
> the non-read nodes?

You have to read the data before you can checksum it. So on the contrary,
digest (checksum) vs. data read has no effect on cache behavior.

> If I read with CL==QUORUM in an RF==3 environment, do both read nodes then
> cache the item, or only the primary replica?

Both. Which is what you want; otherwise your digest reads will cause
substantial unnecessary I/O on hot keys.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
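
The effect on effective cache size described above can be sketched with a toy
model. This is not Cassandra code: the ring layout, the function names
(replicas_for, simulate, copies_per_key, effective_rows) and the unbounded
per-node cache are all assumptions made for illustration. The only behavior
taken from the thread is that every replica that serves a read (data or
digest) caches the row, and that with equal-weight replicas the choice of
which one serves the read is basically random.

```python
import random
from collections import defaultdict


def replicas_for(key, num_nodes, rf):
    """Hypothetical placement: a primary node plus the next rf-1 nodes on a ring."""
    primary = key % num_nodes
    return [(primary + i) % num_nodes for i in range(rf)]


def simulate(num_nodes, rf, keys, reads_per_key, nodes_per_read, pick_random, seed=0):
    """Return {node: set of cached keys} after replaying the reads.

    nodes_per_read: 1 models CL==ONE, 2 models CL==QUORUM with RF==3.
    pick_random:    True  -> replicas chosen at random (equal proximity),
                    False -> always the first replicas in ring order.
    """
    rng = random.Random(seed)
    cached = defaultdict(set)
    for key in keys:
        replicas = replicas_for(key, num_nodes, rf)
        for _ in range(reads_per_key):
            if pick_random:
                contacted = rng.sample(replicas, nodes_per_read)
            else:
                contacted = replicas[:nodes_per_read]
            # Every contacted replica reads the row, so every one caches it,
            # whether it returns the data or only a digest.
            for node in contacted:
                cached[node].add(key)
    return cached


def copies_per_key(cached, keys):
    """Average number of nodes holding each key in their cache."""
    return sum(len(rows) for rows in cached.values()) / len(keys)


def effective_rows(total_cache_slots, copies):
    """Distinct rows the cluster can cache if each row occupies `copies` slots."""
    return total_cache_slots / copies
```

In this model, calling simulate with nodes_per_read=1 and pick_random=False
leaves exactly one cached copy of each key, while pick_random=True with enough
reads per key drives the average toward RF copies. Under the assumption that
the 18 million figure in the thread is the total number of row-cache slots in
the cluster, effective_rows(18_000_000, 3) gives the 6 million figure, which
is the 3x difference being discussed.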