Aaron,

- Version of Cassandra is 1.0.7, Python cql is 1.0.7
- I AM shutting down the server and deleting the /var/lib/cassandra directory, then starting it back up between tests
- nodetool cfstats always looks like this: http://pastebin.com/95EAkZK5 with only the MutationStage Completed getting incremented as the batches complete
- Not sure what I am looking at. I guess the TotalWriteLatencyMicros: 82950? See here: http://imgur.com/60sU5
- I will look into the stress testing tools
- The Python script I am using is here: http://pastebin.com/Rt6wSq2h

I am running new tests today with 10000 inserts to get a better idea. See here:
https://docs.google.com/spreadsheet/ccc?key=0AoNsUtyQ2cAodGpxLV9MODNJV3BFMzBicUhnUjM2M0E

Thank you for the response. Let me know if there is any more info I can provide.

-Blake

On Tue, Jan 24, 2012 at 11:52 PM, aaron morton <aa...@thelastpickle.com> wrote:

> There are a few slight differences in the execution paths, nothing jumps out
> (it *looks* like the authorization to write to the CF is checked for each
> statement in the batch, not sure how heavy that is).
>
> If you send a batch with more statements than concurrent_writes in the
> yaml, some of those statements will have to wait for an available writer
> before completing. This will introduce some latency to the query. You can
> check pending tasks using nodetool tpstats.
>
> Before we get into it further, some thoughts:
>
> * What Cassandra version?
> * Are you running the tests one after another on the same running
>   Cassandra process, or against a new process each time?
> * Have a look at nodetool cfstats to see the write latency for the CF;
>   this is the latency for the actual write. Does it change?
> * Use jconsole to look at the o.a.c.db.StorageProxy MBean; the latency
>   there is for entire requests.
> * Perhaps take a look at the stress testing tools in the distribution and
>   see if their results concur with yours.
>
> If you are still having problems, let us know and include the Python
> script.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/01/2012, at 3:33 PM, Blake Visin wrote:
>
> So I decided that it would be beneficial to use batching in my application
> since I am doing many, many inserts. When I implemented batching in CQL
> using 'BEGIN BATCH'..'APPLY BATCH' I saw a significant decrease in the
> speed of inserts, no matter the number of insert statements I included
> between begin and apply. I created a simple benchmark script in Python and
> posted the results here:
>
> https://docs.google.com/spreadsheet/ccc?key=0AoNsUtyQ2cAodGpxLV9MODNJV3BFMzBicUhnUjM2M0E
>
> As you can see, the larger I made the batches, the longer they took.
>
> Any ideas where to go from here?
>
> Thanks,
> Blake
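
For anyone wanting to reproduce the comparison discussed in this thread, here is a minimal sketch of a batch-versus-single-insert timing harness. It is not the script from the pastebin link above. The helper names (`build_batch`, `chunked`, `time_inserts`) are hypothetical, and `execute` stands in for whatever the client library provides to run one CQL string (for example a cursor's execute method). The `;` statement terminator is also an assumption; adjust it to what your CQL version accepts inside a batch.

```python
import time


def build_batch(statements, terminator=";"):
    """Wrap CQL statements in BEGIN BATCH ... APPLY BATCH.

    `terminator` is an assumption: separator rules inside a batch differ
    between CQL versions, so pass "" if yours does not want semicolons.
    """
    if not statements:
        raise ValueError("a batch needs at least one statement")
    lines = [s.strip().rstrip(";") + terminator for s in statements]
    return "BEGIN BATCH\n" + "\n".join(lines) + "\nAPPLY BATCH" + terminator


def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def time_inserts(execute, statements, batch_size=1):
    """Send all statements through `execute` and return elapsed seconds.

    With batch_size == 1 each statement is sent on its own; otherwise the
    statements are grouped into batches of at most `batch_size`.
    `execute` is any callable that takes one CQL string (hypothetical
    stand-in for the real driver call).
    """
    started = time.perf_counter()
    for chunk in chunked(statements, batch_size):
        if batch_size == 1:
            execute(chunk[0])
        else:
            execute(build_batch(chunk))
    return time.perf_counter() - started
```

Calling `time_inserts` with the same list of statements and different `batch_size` values (for example 1, then values below and above the configured number of concurrent writers) gives client-side timings that can be compared against the write latency reported by nodetool cfstats and the pending tasks shown by nodetool tpstats.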