Re: Cassandra Collections performance issue

Benedict Elliott Smith Wed, 10 Feb 2016 09:42:46 -0800

If the overwrites are per map key, no tombstones are generated; tombstones
are created only if the whole map is re-imaged, and prior to 3.0 this can
indeed be a major problem if done frequently.
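
To make the distinction concrete, here is a rough sketch that just builds the
two flavours of CQL statement as strings. The table "ks.items", the column
"attrs" and the key column "id" are made-up names for illustration, not taken
from the original poster's schema, and the quoting helper only handles text
values.

```python
# Sketch only: builds CQL text, does not talk to a cluster.
# "ks.items", "attrs" and "id" are hypothetical names.

def q(value):
    """Quote a Python string as a CQL text literal."""
    return "'" + value.replace("'", "''") + "'"


def per_key_updates(table, column, row_id, entries):
    """One UPDATE per map key: overwrites each entry in place, no tombstone."""
    return [
        f"UPDATE {table} SET {column}[{q(k)}] = {q(v)} WHERE id = {q(row_id)};"
        for k, v in entries.items()
    ]


def whole_map_overwrite(table, column, row_id, entries):
    """Single UPDATE assigning the whole map: this form writes a tombstone
    covering the old contents before the new entries are inserted."""
    body = ", ".join(f"{q(k)}: {q(v)}" for k, v in entries.items())
    return f"UPDATE {table} SET {column} = {{{body}}} WHERE id = {q(row_id)};"


if __name__ == "__main__":
    new_values = {"a": "1", "b": "2", "c": "3"}
    for stmt in per_key_updates("ks.items", "attrs", "r1", new_values):
        print(stmt)
    print(whole_map_overwrite("ks.items", "attrs", "r1", new_values))
```

If your client code is doing the second form on every write, switching to the
first should avoid the tombstones.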


Prior to 3.0, collections also prevent certain optimisations to cell
comparisons, so adding one to a table can cause an appreciable performance
decline. Unfortunately, dropping the collection won't resolve the
degradation, as its prior presence continues to haunt the table. To restore
performance you will need to recreate your table without the collection
column and reinsert your data. Or upgrade to 3.0.
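
If you go the recreate-and-reinsert route, one option (the one Rob's anecdote
further down the thread describes) is to carry the map as JSON in a plain text
column. A minimal sketch, assuming a hypothetical replacement table
"ks.items_v2" with a text column "attrs_json"; all names here are
illustrative, and with only three entries separate text columns would work
just as well.

```python
import json

# Hypothetical replacement schema: no collection column at all.
CREATE_TABLE = (
    "CREATE TABLE ks.items_v2 ("
    "id text PRIMARY KEY, "
    "attrs_json text"
    ");"
)


def pack_attrs(attrs):
    """Serialise the former map<text, text> value into one text cell.
    Keys are sorted so the same map always produces the same string."""
    return json.dumps(attrs, sort_keys=True, separators=(",", ":"))


def unpack_attrs(text):
    """Inverse of pack_attrs, for the read path."""
    return json.loads(text)


if __name__ == "__main__":
    old_row = {"id": "r1", "attrs": {"b": "2", "a": "1", "c": "3"}}
    cell = pack_attrs(old_row["attrs"])
    print(CREATE_TABLE)
    print(cell)
    assert unpack_attrs(cell) == old_row["attrs"]
```

The trade-off is that you can no longer update a single entry server-side;
every write replaces the whole text value.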


On 9 February 2016 at 16:39, daemeon reiydelle <daeme...@gmail.com> wrote:

> I think the key to your problem might be around "we overwrite every
> value". You are creating a large number of tombstones, forcing many reads
> to pull current results. You would do well to rethink why you are having
> to overwrite values all the time under the same key. You would be better to
> figure out how to add values under a key and then age off the old values. I
> would say that (at least at scale) you have a classic anti-pattern in play.
>
>
> *.......*
>
>
>
> *Daemeon C.M. Reiydelle
> USA (+1) 415.501.0198
> London (+44) (0) 20 8144 9872*
>
> On Mon, Feb 8, 2016 at 5:23 PM, Robert Coli <rc...@eventbrite.com> wrote:
>
>> On Mon, Feb 8, 2016 at 2:10 PM, Agrawal, Pratik <paagr...@amazon.com>
>> wrote:
>>
>>> Recently we added one of the table fields as a Map<text, text> in
>>> *Cassandra 2.1.11*. Currently we read every field from the Map and
>>> overwrite the map values. The Map is of size 3. We saw that writes are
>>> 30-40% slower while reads are 70-80% slower. Please find below some
>>> metrics that can help.
>>>
>>> My question is: are there any known issues with Cassandra map
>>> performance? As I understand it, each CQL3 Map entry maps to a column in
>>> Cassandra; with that assumption we are just creating 3 columns, right?
>>> Any insight on this issue would be helpful.
>>>
>>
>> I have previously heard reports along similar lines, but in the other
>> direction.
>>
>> eg - "I moved from a collection to a TEXT column with JSON in it, and my
>> reads and writes both became much faster!"
>>
>> I'm not sure if the issue has been raised as an Apache Cassandra Jira,
>> iow if it is a known and expected limitation as opposed to just a
>> performance issue.
>>
>> If I were you, I would consider filing a repro case as a Jira ticket, and
>> responding to this thread with its URL. :D
>>
>> =Rob
>>
>>
>
>
