Re: Cassandra Collections performance issue

Agrawal, Pratik Wed, 24 Feb 2016 14:11:37 -0800

Hi Daemeon,

We tried changing the "we overwrite every value" behavior so that we update 
only one element in the map, and we still saw the same performance degradation.

Thanks,
Pratik

From: daemeon reiydelle <daeme...@gmail.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Tuesday, February 9, 2016 at 11:39 AM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: "Peddi, Praveen" <pe...@amazon.com>
Subject: Re: Cassandra Collections performance issue

I think the key to your problem might be around "we overwrite every value". You 
are creating a large number of tombstones, forcing many reads to pull current 
results. You would do well to rethink why you are having to overwrite values 
all the time under the same key. You would be better off figuring out how to 
add values under a key and then age off the old values. I would say that (at 
least at scale) you have a classic anti-pattern in play.
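
To make the tombstone point concrete, here is a minimal Python sketch. It is a 
toy model and not Cassandra's storage engine: it assumes that assigning a whole 
non-frozen map (UPDATE t SET m = {...}) first writes one range tombstone to 
clear the old contents, while a per-element update (UPDATE t SET m['k'] = 'v') 
writes only a single cell. The function names and the write log are 
hypothetical, and compaction and gc_grace_seconds are ignored.

```python
# Toy model of the writes produced by two ways of updating a map column.
# Assumption: whole-map assignment emits a range tombstone plus one cell
# per entry; a per-element update emits one cell and no tombstone.

def overwrite_whole_map(log, new_map):
    """Model `UPDATE t SET m = {...}`."""
    log.append(("range_tombstone", None))  # clears any previous entries
    for key in new_map:
        log.append(("cell", key))


def update_one_entry(log, key):
    """Model `UPDATE t SET m['key'] = 'value'`."""
    log.append(("cell", key))


def count_tombstones(log):
    """Count range tombstones recorded in the write log."""
    return sum(1 for kind, _ in log if kind == "range_tombstone")


whole, single = [], []
for i in range(1000):
    overwrite_whole_map(whole, {"a": str(i), "b": str(i), "c": str(i)})
    update_one_entry(single, "a")

print("whole-map overwrites:", count_tombstones(whole), "tombstones,", len(whole), "writes")
print("per-element updates: ", count_tombstones(single), "tombstones,", len(single), "writes")
```

Under these assumptions the whole-map path accumulates one tombstone per 
update until compaction can purge it, which is the extra work a read may have 
to skip over. The model says nothing about why a per-element update would 
still be slow.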

.......

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, Feb 8, 2016 at 5:23 PM, Robert Coli <rc...@eventbrite.com> wrote:
On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik <paagr...@amazon.com> wrote:
Recently we added one of the table fields as a Map<text, text> in Cassandra 
2.1.11. Currently we read every entry from the map and overwrite the map 
values. The map has 3 entries. We saw that writes are 30-40% slower while reads 
are 70-80% slower. Please find below some metrics that can help.

My question is: are there any known issues with Cassandra map performance? As I 
understand it, each CQL3 map entry maps to a column in Cassandra, so under that 
assumption we are just creating 3 columns, right? Any insight on this issue 
would be helpful.
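
The "one column per map entry" assumption can be sketched as follows. This is 
an illustration only, assuming the 2.1-era layout in which each entry of a 
non-frozen map is stored as its own cell whose name combines the clustering 
values, the CQL column name, and the map key. The names row1 and attrs and the 
helper map_cells are hypothetical.

```python
# Illustrative layout of a 3-entry map<text, text> column as storage cells.
# Assumption: cell name = (clustering values..., CQL column name, map key).

def map_cells(clustering, column, entries):
    """Return (cell_name, value) pairs, one per map entry, ordered by key."""
    return [
        ((*clustering, column, key), value)
        for key, value in sorted(entries.items())
    ]


cells = map_cells(("row1",), "attrs", {"k1": "v1", "k2": "v2", "k3": "v3"})
for name, value in cells:
    print(name, "=>", value)
```

Under that assumption a 3-entry map does produce 3 cells. Any extra cost would 
have to come from how the entries are written and read, not from the number of 
cells.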

I have previously heard reports along similar lines, but in the other direction.

eg - "I moved from a collection to a TEXT column with JSON in it, and my reads 
and writes both became much faster!"

I'm not sure if the issue has been raised as an Apache Cassandra Jira, iow if 
it is a known and expected limitation as opposed to just a performance issue.

If I were you, I would consider filing a repro case as a Jira ticket, and 
responding to this thread with its URL. :D

=Rob
