LCS is generally used for workloads with a high read-to-write ratio, though it 
sounds like you may be running a write-heavy load instead.  LCS involves more 
compaction as you write than STCS does, because LCS always tries to keep a 
1-to-10 ratio between levels.  While LCS involves more compaction in general 
(more I/O, more CPU), I am not sure how updates compare with inserts.  From 
what I understand, STCS will happily leave duplicate rows across SSTables, 
while LCS does not like to do this, so as you update you will compact 
constantly.  Well, that is my understanding.  Have you tried STCS at all?  
(P.S. This is just from what I understand, so take it with a grain of salt.)
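
To make the 1-to-10 ratio between levels concrete, here is a rough sketch of how the target size of each level grows. It assumes a fanout of 10 and uses the sstable_size_in_mb of 15 quoted later in this thread; the function name is made up for illustration, and real level sizes will differ with version and settings.

```python
def lcs_level_capacity_mb(level, sstable_size_mb=15, fanout=10):
    """Approximate target capacity of an LCS level in MB.

    Assumes each level holds `fanout` times as much data as the one
    below it, starting from `fanout` SSTables in L1.
    """
    if level < 1:
        raise ValueError("level must be >= 1 (L0 has no fixed target size)")
    return sstable_size_mb * fanout ** level


# Print the approximate capacity of the first few levels.
for level in range(1, 5):
    print(f"L{level}: ~{lcs_level_capacity_mb(level):,} MB")
```

With these assumptions the levels come out at roughly 150 MB, 1,500 MB, 15,000 MB and 150,000 MB, which is why rewriting existing keys keeps pushing data up through the levels.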

Also, nodetool has some great tools: you can run nodetool compactionstats, 
etc., and see how backlogged you are in pending tasks.  How many are pending?
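
If you want to track the backlog over time, a small script can pull the number out of the compactionstats text. This is only a sketch: it assumes the output contains a line of the form "pending tasks: N", and the helper name and sample text are made up for illustration.

```python
import re


def pending_compactions(compactionstats_output):
    """Return the number after 'pending tasks:', or None if it is absent."""
    match = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    return int(match.group(1)) if match else None


# Illustrative text only, not output captured from a real node.
sample = "pending tasks: 42\n"
print(pending_compactions(sample))
```

On a real node you could feed it the text returned by subprocess.run(["nodetool", "compactionstats"], capture_output=True, text=True).stdout and log the value every few seconds during the insert and update tests.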

Later,
Dean

From: Jay Svc <jaytechg...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, March 26, 2013 6:08 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Insert v/s Update performance

Thanks Dean,

I have used LCS with an sstable_size_in_mb of 15. I have also tried a bigger 
sstable_size_in_mb and observed similar behavior.

Does compaction work differently for updates vs. inserts? I believe all keys go 
to a single SSTable. What other options should I look into?

Thanks,
Jay




On Tue, Mar 26, 2013 at 6:18 PM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
Most likely compaction kicks in because updates cause duplicated rows in STCS, 
and compaction adds load that may not have been there before (check your logs).  
You can also increase the number of nodes in your cluster to better handle the 
load.
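
A toy model of why updates give compaction more to do than inserts. This is a simplified sketch, not how Cassandra is implemented: each SSTable is a plain dict of key to (timestamp, value), the newest timestamp wins on merge, and all names here are made up for illustration.

```python
def compact(sstables):
    """Merge SSTables, keeping the newest (timestamp, value) for each key."""
    merged = {}
    for sstable in sstables:
        for key, (timestamp, value) in sstable.items():
            if key not in merged or timestamp > merged[key][0]:
                merged[key] = (timestamp, value)
    return merged


def redundant_rows(sstables):
    """Count the rows that compaction would discard as superseded."""
    return sum(len(s) for s in sstables) - len(compact(sstables))


# Insert workload: each flush holds new keys, so nothing is redundant.
inserts = [{"k1": (1, "a")}, {"k2": (2, "b")}]
# Update workload: the same key is rewritten in every flush.
updates = [{"k1": (1, "a")}, {"k1": (2, "b")}]

print(redundant_rows(inserts), redundant_rows(updates))
```

In the insert case every row survives the merge; in the update case the older copy of k1 is duplicated across SSTables until compaction removes it.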

Later,
Dean

From: Jay Svc <jaytechg...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, March 26, 2013 5:05 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Insert v/s Update performance

Hi Team,

I have a 3-node cluster. I am writing data to these nodes at a rate of 
2,000 records/second. What I observed is that if I do inserts (meaning records 
for those keys do not exist; my column family has 0 records to start with), I 
get better write performance, a low SSTable count, and low pending compactions; 
write latency is acceptable, and CPU utilization on each node is between 35% 
and 85%.

When I ran the same test but with updates this time (meaning records already 
exist in the column family with the same keys), I observed that my SSTable 
count went up 3 times. Pending compactions went up more than 2 times, write 
latency went up too, and CPU utilization was almost 92% to 100%.

What is the reason for update performance deteriorating compared with insert 
performance? Since this is critical, your help is highly appreciated.

P.S. I also observed a high number of pending MutationStage tasks in my 
nodetool tpstats output.

Thanks,
Jay
