Phil Stanhope <pstanhope <at> wimba.com> writes:

> How are you doing your inserts?
>
> I draw a clear line between 1) bootstrapping a cluster with data and
> 2) simulating expected/projected read/write behavior.
>
> If you are bootstrapping, then I would look into the batch_mutate APIs.
> They allow you to improve your write performance dramatically.
>
> If you are read/write testing on a populated cluster, insert and
> batch_insert (for super columns) are the way to go.
>
> As Ben has pointed out to me in numerous threads ... think carefully
> about your replication factor. Do you want the data on all nodes? Or
> sufficiently replicated so that you can recover? Do you want consistency
> at the time of write? Or eventually?
>
> Cassandra has a bunch of knobs that you can turn ... but that
> flexibility requires that you think about your expected usage patterns
> and operational policies.
>
> -phil
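[The batching idea above can be sketched in a few lines. This is an illustrative Python sketch, not the Thrift client API itself: `send_batch` is a hypothetical stand-in for whatever single batch call (e.g. batch_mutate) your client exposes, and here it just records batch sizes.]

    # Group rows into fixed-size chunks so each network round trip
    # carries many mutations instead of one.
    def chunk(rows, batch_size=100):
        """Yield successive batches of up to batch_size rows."""
        for i in range(0, len(rows), batch_size):
            yield rows[i:i + batch_size]

    sent_batches = []

    def send_batch(batch):
        # In a real client this would be one batched write RPC
        # (e.g. a single batch_mutate call); here we just record it.
        sent_batches.append(len(batch))

    rows = [("key%05d" % i, {"col": "value"}) for i in range(250)]
    for batch in chunk(rows, batch_size=100):
        send_batch(batch)

[With 250 rows this issues three calls instead of 250: two batches of 100 and one of 50.]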
My inserts are being done 100 rows at a time using batch_mutate(). I bring up all 10 nodes in my Cassandra cluster at once (no live bootstrapping of nodes). Once they are up, I begin populating the database using 8 write clients (on 8 different VMs), each writing 100 rows at a time. As mentioned earlier, each client writes to a different Cassandra server node, so no single server node is fielding all the writes simultaneously. I have a replication factor of 3 because I need to be able to survive 2 out of 10 nodes going down at once.

I am baffled by all the "Value too large" exceptions that are occurring on every one of my 10 servers:

ERROR [MESSAGE-STREAMING-POOL:1] 2010-06-14 19:30:24,471 DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.IOException: Value too large for defined data type
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:619)

It seems to be happening just after this is logged:

INFO [AE-SERVICE-STAGE:1] 2010-06-14 19:28:39,851 StreamOut.java

I'm also baffled that after all compactions are done on every one of the 10 servers, about 5 out of 10 servers are still at 40% CPU usage, even though they are doing 0 disk I/O. I am not running anything else on these server nodes except for Cassandra. The compactions have been done for over an hour. The last write took place 5 hours ago.

Thank you for any help,
Julie
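[An aside on the replication-factor point raised in this exchange: whether RF=3 actually survives 2 of 10 nodes going down depends on the consistency level used per operation. The arithmetic is general quorum math, not a Cassandra API; a sketch, assuming a quorum of floor(RF/2) + 1 live replicas:]

    # Quorum arithmetic for a replication factor of 3, as described above.
    def quorum(rf):
        # Minimum replicas that must respond for a quorum operation.
        return rf // 2 + 1

    rf = 3
    q = quorum(rf)                 # 2 replicas of a key must respond
    tolerated_at_quorum = rf - q   # quorum ops tolerate 1 lost replica of a key
    tolerated_at_one = rf - 1      # ops needing 1 replica tolerate 2 lost replicas

[So with RF=3, a quorum read or write of a given key tolerates only one of that key's replicas being down; surviving two simultaneous failures for every key requires operating at the weakest consistency level.]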