Re: 15 seconds to increment 17k keys?

Richard Low Thu, 01 Sep 2011 03:36:58 -0700

Assuming you have replicate_on_write enabled (which you almost
certainly do for counters), each increment requires a read as part of
the write.  This means counter increments, even if your whole data set
fits in cache, are significantly slower than normal column inserts.  I
would say ~1k increments per second is about right, although you can
probably do some tuning to improve this.
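
As a rough sanity check, the figures from the original question (about
17k keys per batch, up to 15 seconds per batch) can be turned into a
rate.  This is only a sketch of the arithmetic in modern Python 3; the
function name is made up for illustration and nothing here talks to
Cassandra.

```python
def increments_per_second(num_keys, elapsed_seconds):
    """Throughput implied by one batch: keys incremented per second."""
    return num_keys / elapsed_seconds


# Figures quoted in the original question: ~17k keys, up to 15 seconds.
rate = increments_per_second(17_000, 15.0)

# Average time spent per increment, in milliseconds.
ms_per_increment = 1000.0 / rate

print(f"{rate:.0f} increments/s, {ms_per_increment:.2f} ms per increment")
# prints "1133 increments/s, 0.88 ms per increment"
```

So 15 seconds for 17k increments is roughly 1.1k increments per second,
which is in line with the ~1k/s estimate and with the ~1ms per add
reported earlier in the thread.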


I've also found that the pycassa client uses significant amounts of
CPU, so be careful you are not CPU bound on the client.
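
One way to check whether the client is CPU bound is to compare CPU time
with wall-clock time around a batch.  The sketch below uses only the
Python 3 standard library; cpu_fraction and busy_loop are hypothetical
names, and busy_loop is a stand-in for whatever function actually sends
your batch.

```python
import time


def cpu_fraction(operation):
    """Run operation() once; return (wall_seconds, cpu_seconds, cpu/wall)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    operation()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu, (cpu / wall if wall > 0 else 0.0)


def busy_loop():
    """Stand-in workload (assumption): replace with the real batch call."""
    total = 0
    for i in range(200_000):
        total += i * i
    return total


wall, cpu, fraction = cpu_fraction(busy_loop)
print(f"wall={wall:.3f}s cpu={cpu:.3f}s cpu/wall={fraction:.2f}")
```

A ratio close to 1.0 suggests the client process is spending most of
the batch time on its own CPU work; a ratio well below 1.0 suggests it
is mostly waiting on the network or the cluster.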

-- 
Richard Low
Acunu | http://www.acunu.com | @acunu

On Thu, Sep 1, 2011 at 2:31 AM, Yang <teddyyyy...@gmail.com> wrote:
> 1ms per add operation is the general order of magnitude I have seen with my
> tests.
>
>
> On Wed, Aug 31, 2011 at 6:04 PM, Ian Danforth <idanfo...@numenta.com> wrote:
>>
>> All,
>>
>>  I've got a 4 node cluster (ec2 m1.large instances, replication = 3)
>> that has one primary counter type column family, that has one column
>> in the family. There are millions of rows. Each operation consists of
>> doing a batch_insert through pycassa, which increments ~17k keys. A
>> majority of these keys are new in each batch.
>>
>>  Each operation is taking up to 15 seconds. For our system this is a
>> significant bottleneck.
>>
>>  Does anyone know if this write speed is expected?
>>
>> Thanks in advance,
>>
>>  Ian
>
>
