> I tried setting the IO mode to standard, but it seemed to be a little slower
> and couldn't get the machine to come back online with adequate read
> performance, so I set it back. I'll have to write a solid cache warming
> script if I'm going to try that again.

What cache are you talking about?  Did you turn on row caching?

When we turned on row caching, repeat hits to the same rows were fast, of
course, but we didn't (given our data access patterns) see significant
differences compared to mmap-ing the data.  And once we hit the limit of our
row cache, out-of-cache hits were pretty costly (I don't have hard numbers,
but I recall it being worse than having mmap page in/out).

Is your client making random reads of more rows than will fit in RAM on your
box?  We found that in that scenario, once Cassandra has used up all of the
free memory on the box, using mmap was slightly worse than using standard
data access.

We happen to be lucky that our real-world data access is limited to a small
subset of rows in any given time period, so mmap works great for us.  I guess
the best thing to do is to figure out how to make each Cassandra node only
need to service requests for data that can fit into memory in a given time
period.  More nodes, a lower replication factor, more memory, I guess...
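
A rough way to reason about those three knobs is a back-of-envelope check
like the Python sketch below.  It is only an illustration: the function
names and all of the numbers are made up, and it assumes the hot rows are
spread evenly across the ring and that every replica of a hot row has to
keep it in memory.  Real clusters will be less tidy than this.

```python
GIB = 1024 ** 3


def hot_bytes_per_node(hot_set_bytes, replication_factor, node_count):
    """Approximate hot data each node must hold in memory.

    Assumes an even spread of rows across nodes and that every replica
    of a hot row serves reads for it (both are simplifications).
    """
    if node_count <= 0 or replication_factor <= 0:
        raise ValueError("node_count and replication_factor must be positive")
    return hot_set_bytes * replication_factor / node_count


def fits_in_memory(hot_set_bytes, replication_factor, node_count,
                   cache_bytes_per_node):
    """True if the per-node share of hot data fits in the memory left
    over for the page cache (or row cache) on each node."""
    per_node = hot_bytes_per_node(hot_set_bytes, replication_factor, node_count)
    return per_node <= cache_bytes_per_node


# Hypothetical cluster: 200 GiB of hot rows, 48 GiB usable for caching per node.
baseline = fits_in_memory(200 * GIB, 3, 10, 48 * GIB)    # 60 GiB per node
lower_rf = fits_in_memory(200 * GIB, 2, 10, 48 * GIB)    # 40 GiB per node
more_nodes = fits_in_memory(200 * GIB, 3, 15, 48 * GIB)  # 40 GiB per node
print(baseline, lower_rf, more_nodes)
```

With these made-up numbers the baseline does not fit, while either dropping
the replication factor to 2 or growing to 15 nodes brings the per-node hot
set under the memory budget.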

I'm definitely waiting to hear how things change with 0.6.2.

Kyusik Chung
