Hi Aaron,

Thank you for your input. I have been monitoring GC activity and watching the heap; both look fairly linear, without any spikes.
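(For reference, one quick way to double-check that: Cassandra reports long collections through GCInspector in the system log, so grepping for it surfaces pauses a heap graph can miss; the log path below is an assumption and may differ per install:)

    grep GCInspector /var/log/cassandra/system.log | tail -20

Long ParNew or ConcurrentMarkSweep pauses reported there would be exactly the spikes being ruled out here.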
When I look at CPU, it shows high utilization during writes alone, and I also expect heavy read traffic. When I tried the compaction_throughput_* parameter, I observed that a higher value in my case gives better CPU utilization and keeps pending compactions pretty low. How does this parameter work?

I have 3 nodes with 2 cores each, and a high write rate. For a high *update* and high read situation, which parameters should we consider tuning?

Thanks,
Jay

On Wed, Mar 27, 2013 at 9:55 PM, aaron morton <[email protected]> wrote:

> * Check for GC activity in the logs.
> * Check the volume the commit log is on to see if it's over-utilised.
> * Check if the dropped messages correlate to compaction; look at the
>   compaction_* settings in yaml and consider reducing the throughput.
>
> Like Dean says, if you have existing data it will result in more
> compaction. You may be able to get a lot of writes through in a clean new
> cluster, but it also has to work when compaction and repair are running.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
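(For reference on the compaction_* settings mentioned above: compaction_throughput_mb_per_sec is a node-wide cap, in MB/s, on the total I/O that compaction is allowed to consume. A higher cap lets compactions drain faster at the cost of more I/O and CPU while they run, and 0 removes the throttle entirely. A minimal cassandra.yaml sketch; the values are illustrative, not recommendations:)

    # cassandra.yaml
    compaction_throughput_mb_per_sec: 16   # total compaction I/O allowed per node; 0 disables throttling
    concurrent_compactors: 2               # number of parallel compactions; the default varies by version

The throughput cap can also be changed at runtime with "nodetool setcompactionthroughput 32", which avoids a restart while experimenting.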
> On 27/03/2013, at 1:43 PM, Jay Svc <[email protected]> wrote:
>
> Thanks Dean again!
>
> My use case is a high number of reads and writes; out of those I am just
> focusing on writes for now. I thought LCS was suitable for my situation. I
> tried the same test on STCS and the results are the same.
>
> I ran nodetool tpstats and the MutationStage pending count is very high. At
> the same time, the SSTable count and pending compactions are high too during
> my updates.
>
> Please find a snapshot of my system log below.
>
> INFO [ScheduledTasks:1] 2013-03-26 15:05:48,560 StatusLogger.java (line 116) OpsCenter.rollups86400           0,0
> INFO [FlushWriter:55] 2013-03-26 15:05:48,608 Memtable.java (line 264) Writing Memtable-InventoryPrice@1051586614(11438914/129587272 serialized/live bytes, 404320 ops)
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,561 MessagingService.java (line 658) 2701 MUTATION messages dropped in last 5000ms
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,562 StatusLogger.java (line 57) Pool Name                    Active   Pending   Blocked
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,563 StatusLogger.java (line 72) ReadStage                         0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,568 StatusLogger.java (line 72) RequestResponseStage              0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,627 StatusLogger.java (line 72) ReadRepairStage                   0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,627 StatusLogger.java (line 72) MutationStage                    32     19967         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) ReplicateOnWriteStage             0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) GossipStage                       0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,628 StatusLogger.java (line 72) AntiEntropyStage                  0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) MigrationStage                    0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) StreamStage                       0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,629 StatusLogger.java (line 72) MemtablePostFlusher               1         1         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) FlushWriter                       1         1         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) MiscStage                         0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,673 StatusLogger.java (line 72) commitlog_archiver                0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 72) InternalResponseStage             0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 72) HintedHandoff                     0         0         0
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,674 StatusLogger.java (line 77) CompactionManager                 1        27
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,675 StatusLogger.java (line 89) MessagingService                n/a      0,22
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,724 StatusLogger.java (line 99) Cache Type        Size        Capacity        KeysToSave        Provider
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 100) KeyCache        142315        2118997        all
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 106) RowCache        0        0        all        org.apache.cassandra.cache.SerializingCacheProvider
> INFO [ScheduledTasks:1] 2013-03-26 15:05:53,725 StatusLogger.java (line 113) ColumnFamily                Memtable ops,data
> INFO [ScheduledTasks:1] 2013-03-26 15:0
>
> Thanks,
> Jay
>
>
> On Tue, Mar 26, 2013 at 7:15 PM, Hiller, Dean <[email protected]> wrote:
>
>> LCS is generally used for a high read vs. write ratio, though it sounds like
>> you may be doing a heavy write load instead. LCS will involve more
>> compactions as you write to the system compared to STCS, because LCS is
>> always trying to keep a 1 to 10 ratio between levels. While LCS will
>> involve more compaction in general (more I/O, more CPU), I am not sure on
>> update vs. insert. From what I understand, STCS will happily duplicate
>> rows across SSTables while LCS does not like to do this, so as you update
>> you will constantly compact... well, that is my understanding. Have you
>> tried STCS out at all? (P.S. This is just from what I understand, so take
>> it with a grain of salt.)
>>
>> Also, there are some great tools in nodetool as well, so you can run
>> nodetool compactionstats, etc. and see how backlogged you are in
>> pending tasks... how many are pending?
>>
>> Later,
>> Dean
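(For reference, the compaction strategy and SSTable target size being discussed are per column family settings. A minimal sketch of changing them, assuming CQL3 on Cassandra 1.2; the table name and size here are illustrative only:)

    ALTER TABLE inventory_price
      WITH compaction = { 'class' : 'LeveledCompactionStrategy',
                          'sstable_size_in_mb' : 160 };

On 1.1 the same knobs are exposed as compaction_strategy and compaction_strategy_options on the column family. A larger sstable_size_in_mb simply means fewer, larger SSTables per level.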
>> From: Jay Svc <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, March 26, 2013 6:08 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: Insert v/s Update performance
>>
>> Thanks Dean,
>>
>> I have used LCS with an sstable_size_in_mb of 15. I have also tried a bigger
>> sstable_size_in_mb and observed similar behavior.
>>
>> Does compaction work differently for updates vs. inserts? I believe all
>> keys go to a single SSTable. What other options should I look into?
>>
>> Thanks,
>> Jay
>>
>>
>> On Tue, Mar 26, 2013 at 6:18 PM, Hiller, Dean <[email protected]> wrote:
>>
>> Most likely compaction kicks in as updates cause duplicated rows in STCS,
>> and compaction causes load that may not have been there before (check your
>> logs). Also, you can increase the number of nodes in your cluster as well
>> to better handle the load.
>>
>> Later,
>> Dean
>>
>> From: Jay Svc <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Tuesday, March 26, 2013 5:05 PM
>> To: "[email protected]" <[email protected]>
>> Subject: Insert v/s Update performance
>>
>> Hi Team,
>>
>> I have this 3 node cluster. I am writing data to these nodes at a rate
>> of 2,000 records/second. What I observed is that if I do inserts (meaning
>> records for those keys do not exist; my column family has 0 records to
>> start with), then I get better write performance, a low SSTable count, low
>> pending compactions, acceptable write latency, and CPU utilization on each
>> node between 35% and 85%.
>>
>> When I ran the same test but for updates this time (meaning records already
>> exist in the column family with the same keys), I observed that my SSTable
>> count went up 3x, pending compactions went up more than 2x, write latency
>> went up too, and CPU utilization was almost 92% to 100%.
>>
>> What is the reason for the deteriorating update performance vs. insert
>> performance? Since this is critical, your help is highly appreciated.
>>
>> P.S. I also observed a high number of pending MutationStage tasks in
>> nodetool tpstats.
>>
>> Thanks,
>> Jay
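(For reference, the figures discussed in this thread, pending MutationStage tasks, pending compactions, and SSTable counts, can all be watched per node with stock nodetool commands; the keyspace name below is a placeholder and the column family name is taken from the log snapshot above:)

    nodetool tpstats                                   # thread pool activity, including MutationStage pending
    nodetool compactionstats                           # running compactions and pending compaction tasks
    nodetool cfstats                                   # per column family SSTable count, memtable and latency stats
    nodetool cfhistograms my_keyspace InventoryPrice   # SSTables touched per read, latency histograms

Sampling these before and after switching the workload from inserts to updates should show where the extra load is landing.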
