I have a question about which signs from Cassandra indicate that new nodes should be added to the cluster. We are currently seeing long read times from one node that holds about 70GB of data, with 60GB of that in a single column family. We are using a replication factor of 3. I have tracked the slowness down to times when either row-read-stage or message-deserializer-pool is high, at least 4000. My systems are 16-core, 3 TB, 48GB memory servers. We would like to be able to use more of each server than just 70GB.
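As a rough illustration of the kind of check described above, here is a minimal Python sketch that scans thread-pool statistics text for stages at or above a threshold. It assumes the figure of 4000 refers to the pending column and that the text has a four-column layout (pool name, active, pending, completed) like the one `nodetool tpstats` prints. The function name `busy_stages`, the sample text, and all numbers in it are made up for illustration and are not taken from the cluster.

```python
def busy_stages(tpstats_text, threshold=4000):
    """Return {stage_name: pending} for stages whose pending count >= threshold.

    Assumed line layout: <pool name> <active> <pending> <completed>.
    """
    busy = {}
    for line in tpstats_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # blank or malformed line
        name, pending = parts[0], parts[2]
        if not pending.isdigit():
            continue  # header row or non-numeric column
        if int(pending) >= threshold:
            busy[name] = int(pending)
    return busy


# Made-up sample in the assumed layout; not real measurements.
SAMPLE = """\
Pool Name                    Active   Pending      Completed
ROW-READ-STAGE                    8      4200         123456
MESSAGE-DESERIALIZER-POOL         1      4100         654321
ROW-MUTATION-STAGE                0         0          99999
"""
```

Calling `busy_stages(SAMPLE)` on saved output would list only the stages that crossed the threshold, which makes it easier to line up backlog spikes with the slow reads.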
The system is a realtime system that needs to scale quite large. Our current heap size is 25GB and we are getting at least a 50% row cache hit rate. Does it seem strange that Cassandra is not able to handle the workload? We perform multislice gets when reading, similar to what twissandra does, in order to cut down on network operations. Looking at iostat, there do not appear to be a lot of queued reads. What are others seeing when they have to add new nodes? What data sizes per node are they seeing at that point? We need this information to plan our growth and server purchase strategy.

Thanks,
Artie
--
http://yeslinux.org
http://yestech.org