Got it. Thanks again, Aaron. -- Y.
On Tue, Dec 4, 2012 at 3:07 PM, aaron morton <aa...@thelastpickle.com> wrote:

> Does this mean we should not enable row caches until we are absolutely
> sure about what's hot (I think there is a reason why row caches are
> disabled by default)?
>
> Yes and Yes.
> Row cache takes memory and CPU; unless you know you are getting a benefit
> from it, leave it off. The key cache and OS disk cache will help. If you
> find latency is an issue then start poking around.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 5/12/2012, at 4:23 AM, Yiming Sun <yiming....@gmail.com> wrote:
>
> Hi Aaron,
>
> Thank you, and your explanation makes sense. At the time, I thought having
> 1GB of row cache on each node was plenty, because there was an
> aggregated 6GB cache, but you are right: with each row in the 10's of MBs,
> some of the nodes can go into a constant load-and-evict cycle, which would
> hurt performance. I will try as you suggested to 1.) reduce the requested
> entry set, and 2.) increase the row cache size and see if they get better
> hits, and also do 3.) by reversing the requested entry list in alternate
> runs.
>
> Our data space has close to 3 million rows, but we haven't gotten enough
> usage statistics to know what rows are hot. Does this mean we should not
> enable row caches until we are absolutely sure about what's hot (I think
> there is a reason why row caches are disabled by default)? It also seems
> from my test that the OS page cache works much better, but it could be that
> the OS page cache can utilize all the available memory, so it is essentially
> larger. I guess I will find out by doing 2.) above.
>
> best,
>
> -- Y.
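To put rough numbers on the "1GB of row cache versus rows in the 10's of MBs" point, here is a back-of-the-envelope sketch. The capacity and size figures are the ones reported in the nodetool output quoted later in this thread; the row sizes (10, 50, 100 MiB) are illustrative assumptions only, since the thread gives just a range, and per-entry cache overhead is ignored.

```python
MIB = 1024 * 1024

# Figures copied from the nodetool info output for node x.x.x.4 in this thread.
capacity_bytes = 1073741824   # row cache capacity on one node
used_bytes = 1072651974       # row cache size currently in use

free_bytes = capacity_bytes - used_bytes


def rows_that_fit(capacity, row_size_bytes):
    """Upper bound on whole rows that fit, ignoring per-entry overhead."""
    return capacity // row_size_bytes


# Assumed row sizes; the thread only says "10's of MBs, no more than 100MB".
for size_mib in (10, 50, 100):
    count = rows_that_fit(capacity_bytes, size_mib * MIB)
    print(f"{size_mib} MiB rows -> at most {count} rows per node")

print(f"free space: {free_bytes / MIB:.2f} MiB")
```

Under these assumptions a node holds somewhere between about ten and about a hundred rows at a time, and the cache has roughly 1 MiB free, which matches the "only 1 MB free" reading of the stats.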
>
> On Tue, Dec 4, 2012 at 4:47 AM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> > Row Cache : size 1072651974 (bytes), capacity 1073741824 (bytes),
>> > 0 hits, 2576 requests, NaN recent hit rate, 0 save period in seconds
>>
>> So the cache is pretty much full, there is only 1 MB free.
>>
>> There were 2,576 read requests that tried to get a row from the cache.
>> Zero of those had a hit. If you have 6 nodes and RF 2, each node has one
>> third of the data in the cluster (from the effective ownership info). So
>> depending on the read workload, the number of read requests on each node
>> may be different.
>>
>> What I think is happening is reads are populating the row cache, then
>> subsequent reads are evicting items from the row cache before you get back
>> to reading the original rows. So if you read rows 1 to 5, they are put in
>> the cache; when you read rows 6 to 10, they are put in and evict rows 1 to
>> 5. Then when you read rows 1 to 5 again, they are not in the cache.
>>
>> Try testing with a lower number of hot rows, and/or a bigger row cache.
>>
>> But to be honest, with rows in the 10's of MB you will probably only get
>> good cache performance with a small set of hot rows.
>>
>> Hope that helps.
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 1/12/2012, at 5:11 AM, Yiming Sun <yiming....@gmail.com> wrote:
>>
>> > Does anyone have any comments/suggestions for me regarding this? Thanks
>> >
>> > I am trying to understand some strange behavior of the Cassandra row
>> > cache. We have a 6-node Cassandra cluster in a single data center on 2
>> > racks, and neighboring nodes on the ring are on alternating racks. Each
>> > node has a 1GB row cache, with the key cache disabled. The cluster uses
>> > PropertyFileSnitch, and the ColumnFamily I fetch from uses
>> > NetworkTopologyStrategy, with a replication factor of 2. My client code
>> > uses Hector to fetch a fixed set of rows from Cassandra.
>> >
>> > What I don't quite understand is that even after I ran the client code
>> > several times, there are always some nodes with 0 row cache hits, even
>> > though the row caches on all nodes are full and all nodes receive
>> > requests.
>> >
>> > Which nodes have 0 hits seems to be strongly related to the following:
>> >
>> > - the set of row keys to fetch
>> > - the order of the set of row keys to fetch
>> > - the list of hosts passed to Hector's CassandraHostConfigurator
>> > - the order of the list of hosts passed to Hector
>> >
>> > Can someone shed some light on how exactly the row cache works and
>> > hopefully also explain the behavior I have been seeing? I thought that
>> > if the fixed set of row keys is the only thing I am fetching (each row
>> > should be on the order of 10's of MBs, no more than 100MB), and each node
>> > gets requests, and its row cache is filled, there's gotta be some hits.
>> > Apparently this is not the case. Thanks.
>> >
>> > cluster information:
>> >
>> > Address  DC   Rack  Status  State   Load       Effective-Ownership  Token
>> >                                                                     141784319550391026443072753096570088105
>> > x.x.x.1  DC1  r1    Up      Normal  587.46 GB  33.33%               0
>> > x.x.x.2  DC1  r2    Up      Normal  591.21 GB  33.33%               28356863910078205288614550619314017621
>> > x.x.x.3  DC1  r1    Up      Normal  594.97 GB  33.33%               56713727820156410577229101238628035242
>> > x.x.x.4  DC1  r2    Up      Normal  587.15 GB  33.33%               85070591730234615865843651857942052863
>> > x.x.x.5  DC1  r1    Up      Normal  590.26 GB  33.33%               113427455640312821154458202477256070484
>> > x.x.x.6  DC1  r2    Up      Normal  583.21 GB  33.33%               141784319550391026443072753096570088105
>> >
>> >
>> > [user@node]$ ./checkinfo.sh
>> > *************** x.x.x.4
>> > Token            : 85070591730234615865843651857942052863
>> > Gossip active    : true
>> > Thrift active    : true
>> > Load             : 587.15 GB
>> > Generation No    : 1354074048
>> > Uptime (seconds) : 36957
>> > Heap Memory (MB) : 2027.29 / 3948.00
>> > Data Center      : DC1
>> > Rack             : r2
>> > Exceptions       : 0
>> >
>> > Key Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 14400 save period in seconds
>> > Row Cache        : size 1072651974 (bytes), capacity 1073741824 (bytes), 0 hits, 2576 requests, NaN recent hit rate, 0 save period in seconds
>> >
>> > *************** x.x.x.6
>> > Token            : 141784319550391026443072753096570088105
>> > Gossip active    : true
>> > Thrift active    : true
>> > Load             : 583.21 GB
>> > Generation No    : 1354074461
>> > Uptime (seconds) : 36535
>> > Heap Memory (MB) : 828.71 / 3948.00
>> > Data Center      : DC1
>> > Rack             : r2
>> > Exceptions       : 0
>> >
>> > Key Cache        : size 0 (bytes), capacity 0 (bytes), 0 hits, 0 requests, NaN recent hit rate, 14400 save period in seconds
>> > Row Cache        : size 1072602906 (bytes), capacity 1073741824 (bytes), 0 hits, 3194 requests, NaN recent hit rate, 0 save period in seconds
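
The load-and-evict cycle described in this thread (read rows 1 to 5, then rows 6 to 10 push them out, then rows 1 to 5 miss again) can be reproduced with a toy model. This is only a sketch: it assumes a strict least-recently-used policy and counts capacity in rows rather than bytes, which is a simplification and not a description of Cassandra's actual row cache implementation. The `LRUCache` class and `run` helper are hypothetical names made up for the illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache; capacity is a number of rows, not bytes (assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = OrderedDict()
        self.hits = 0
        self.requests = 0

    def read(self, key):
        self.requests += 1
        if key in self.rows:
            self.hits += 1
            self.rows.move_to_end(key)      # mark as most recently used
            return
        self.rows[key] = True               # miss: load the row into the cache
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)   # evict the least recently used row


def run(passes, keys, capacity, alternate=False):
    """Fetch the same key list several times; optionally reverse odd passes."""
    cache = LRUCache(capacity)
    for i in range(passes):
        order = keys[::-1] if alternate and i % 2 == 1 else keys
        for key in order:
            cache.read(key)
    return cache.hits, cache.requests


keys = list(range(1, 11))                                # rows 1..10
same_order = run(4, keys, capacity=5)                    # cache holds half the set
alternating = run(4, keys, capacity=5, alternate=True)   # reverse every other run
fits = run(4, keys, capacity=10)                         # cache holds the whole set

print("same order :", same_order)
print("alternating:", alternating)
print("cache fits :", fits)
```

In this model, fetching a fixed list in the same order with a cache smaller than the list never produces a hit, which is consistent with the full-but-zero-hit caches reported above. Reversing the list on alternate runs recovers some hits, and a cache large enough for the whole set hits on every pass after the first. Real hit rates will also depend on which replica serves each read, so treat the numbers as qualitative.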