Aaron,

   - Version of Cassandra is 1.0.7, python cql is 1.0.7
   - I AM shutting down the server and deleting the /var/lib/cassandra
   directory, then starting it back up between tests
   - nodetool cfstats always looks like this: http://pastebin.com/95EAkZK5 with
   only the MutationStage Completed getting incremented as the batches
   complete
   - Not sure what I am looking at.  I guess the TotalWriteLatencyMicros:
   82950? See here: http://imgur.com/60sU5
   - I will look into the stress testing tools
   - The Python script I am using is here: http://pastebin.com/Rt6wSq2h

I am running new tests today with 10000 inserts to get a better idea.  See
here:
https://docs.google.com/spreadsheet/ccc?key=0AoNsUtyQ2cAodGpxLV9MODNJV3BFMzBicUhnUjM2M0E
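
For anyone who wants to reproduce the comparison without opening the pastebin, here is a minimal sketch of this kind of test. It is not the script linked above: the column family name `bench`, the column name `val`, and the `execute` callable (a stand-in for something like a cursor's execute method) are all placeholders chosen for illustration, and values are interpolated without escaping, which is only acceptable for throwaway benchmark data.

```python
import time


def build_batch(rows, cf="bench"):
    """Return one CQL string wrapping an INSERT per (key, value) pair.

    `cf` and the `val` column are hypothetical names, not from the real schema.
    """
    inserts = [
        "INSERT INTO %s (KEY, val) VALUES ('%s', '%s');" % (cf, key, value)
        for key, value in rows
    ]
    return "BEGIN BATCH\n" + "\n".join(inserts) + "\nAPPLY BATCH;"


def time_inserts(execute, rows, batch_size):
    """Send rows in groups of batch_size via execute(); return elapsed seconds."""
    start = time.time()
    for i in range(0, len(rows), batch_size):
        execute(build_batch(rows[i:i + batch_size]))
    return time.time() - start


if __name__ == "__main__":
    rows = [("key%d" % i, "value%d" % i) for i in range(10000)]
    sent = []
    # sent.append stands in for a real execute call, so this only measures
    # client-side overhead; pass a real cursor's execute to hit a server.
    for size in (1, 10, 100, 1000):
        elapsed = time_inserts(sent.append, rows, size)
        print("batch_size=%d: %.3fs" % (size, elapsed))
```

Note that with batch_size=1 each INSERT is still wrapped in BEGIN BATCH/APPLY BATCH, so a true unbatched baseline would need to send the bare INSERT statements instead.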

Thank you for the response.  Let me know if there is any more info I can
provide.

-Blake


On Tue, Jan 24, 2012 at 11:52 PM, aaron morton <aa...@thelastpickle.com> wrote:

> There are a few slight differences in the execution paths, nothing jumps out
> (it *looks* like the authorization to write to the CF is checked for each
> statement in the batch, not sure how heavy that is.).
>
> If you send a batch with more statements than concurrent_writes in the
> yaml some of those statements will have to wait for an available writer
> before completing. This will introduce some latency to the query. You can
> check pending tasks using nodetool tpstats.
>
> Before we get into it further some thoughts:
>
> * what cassandra version ?
> * you are running the tests one after another on the same running
> cassandra process ? Or are you running it against a new process.
> * have a look at the nodetool cfstats to see the write latency for the cf,
> this is the latency for the actual write. Does it change ?
> * Use jconsole to look at the o.a.c.db.StorageProxy MBean, the latency
> there is for entire requests.
> * perhaps take a look at the stress testing tools in the distribution and
> see if their results concur with yours.
>
> If you are still having problems let us know and include the python
> script.
>
> Cheers
>
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 25/01/2012, at 3:33 PM, Blake Visin wrote:
>
> So I decided that it would be beneficial to use batching in my application
> since I am doing many, many inserts.  When I implemented batching in CQL
> using 'BEGIN BATCH'..'APPLY BATCH' I saw a significant decrease in the
> speed of inserts, no matter the number of insert statements I included
> between begin and apply.  I created a simple benchmark script in Python and
> posted the results here:
>
>
> https://docs.google.com/spreadsheet/ccc?key=0AoNsUtyQ2cAodGpxLV9MODNJV3BFMzBicUhnUjM2M0E
>
> As you can see, the larger I made the batches, the longer they took.
>
> Any ideas where to go from here?
>
> Thanks,
> Blake
>
>
>
