RE: How to increase cassandra's performance in read?

Mark Jones Tue, 20 Apr 2010 09:12:44 -0700

Sorry, I didn't answer your question in my response. This is what I have at this point:


Key(ID)
    When/Where SuperColumn Tag:  and Columns {Data: One Value (not yet written, tags, flags)}


Under some keys (a very small number) there will be two values, like:

Key(ID)
    When/Where SuperColumn Tag:  and Columns {Data: One Value (not yet written, tags, flags)}
    When/Where SuperColumn Tag:  and Columns {Data: One Value (not yet written, tags, flags)}
    Long term, this list will run into the thousands, possibly millions.
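
A minimal sketch of the layout described above, using plain Python dicts rather than any Cassandra client. The helper names (new_store, add_entry), the example keys, and the "when/where" strings are hypothetical; only the shape (key -> supercolumn -> columns, with tags and flags not yet written) comes from the description.

```python
from collections import defaultdict


def new_store():
    # key (ID) -> supercolumn name (When/Where tag) -> columns
    return defaultdict(dict)


def add_entry(store, key, when_where, data, tags=None, flags=None):
    """Add one supercolumn under a key; tags and flags are optional
    because they are not written yet in the layout described above."""
    columns = {"data": data}
    if tags is not None:
        columns["tags"] = tags
    if flags is not None:
        columns["flags"] = flags
    store[key][when_where] = columns
    return store


store = new_store()
# Common case: one key, one supercolumn, one column of about 500 bytes.
add_entry(store, "id-1", "2010-04-20/site-a", b"x" * 500)
# Rare case: two supercolumns under the same key.
add_entry(store, "id-2", "2010-04-20/site-a", b"y" * 500)
add_entry(store, "id-2", "2010-04-20/site-b", b"z" * 500)

supercolumns_per_key = {key: len(scs) for key, scs in store.items()}
print(supercolumns_per_key)
```

With this shape, nearly every read is a lookup of one key that returns a single small supercolumn.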

-----Original Message-----
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, April 20, 2010 10:47 AM
To: user@cassandra.apache.org
Subject: Re: How to increase cassandra's performance in read?

How many columns are in the supercolumn total?

"in super columnfamilies there is a third level of subcolumns; these
are not indexed, and any request for a subcolumn deserializes _all_
the subcolumns in that supercolumn"

http://wiki.apache.org/cassandra/CassandraLimitations
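
To make the quoted limitation concrete, here is a toy model of an unindexed supercolumn. It is only a sketch: the length-prefixed byte layout and the function names (pack_supercolumn, read_subcolumn) are assumptions made for illustration and are not Cassandra's actual on-disk format or API. The point it shows is that, without a subcolumn index, fetching one subcolumn means decoding all of them.

```python
import struct


def pack_supercolumn(subcolumns):
    """Serialize {name: value} (both bytes) as length-prefixed pairs."""
    out = bytearray()
    for name, value in subcolumns.items():
        out += struct.pack(">H", len(name)) + name
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


def read_subcolumn(blob, wanted):
    """Return (value, number_of_subcolumns_decoded).

    There is no index into the blob, so every subcolumn is decoded
    before the requested one can be returned.
    """
    decoded = {}
    pos = 0
    while pos < len(blob):
        (name_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        name = blob[pos:pos + name_len]
        pos += name_len
        (value_len,) = struct.unpack_from(">I", blob, pos)
        pos += 4
        decoded[name] = blob[pos:pos + value_len]
        pos += value_len
    return decoded.get(wanted), len(decoded)


# 1000 subcolumns of 500 bytes each in a single supercolumn.
blob = pack_supercolumn({b"col%04d" % i: b"v" * 500 for i in range(1000)})
value, decoded_count = read_subcolumn(blob, b"col0007")
print(len(blob), decoded_count)
```

In this model the cost of a single-subcolumn read grows with the total size of the supercolumn, which is why the total column count matters.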

On Tue, Apr 20, 2010 at 9:50 AM, Mark Jones <mjo...@imagehawk.com> wrote:
> I too am seeing very slow performance while testing worst case scenarios of
> 1 key leading to 1 supercolumn and 1 column beyond that.
>
>
>
> Key -> SuperColumn -> 1 Column (of ~ 500 bytes)
>
>
>
> Drive utilization is 80-90% (with no swapping), and I'm only dealing with
> 50-70 million rows.  So far, I've found nothing that helps, including
> increasing the key cache from 200k to 500k keys.  I'm guessing the hashing
> prevents better cache performance.
>
>
>
> Read performance is definitely not 3 IOs based on the utilization factors on
> my drives.  I'm not sure the issue was ever settled in the previous e-mails
> as to how to calculate how many IOs were being done for each read.  I've
> been testing with clusters of 1, 2, 3, or 4 machines, and so far all I'm seeing
> with multiple machines is lower performance in a cluster than alone.  I
> keep assuming that at some number of nodes, the performance will begin to
> pick up.  Three of my nodes are running with 8GB (6GB Java Heap), and one
> has 4GB (3GB Java Heap).  The machine with the smallest memory footprint is
> the fastest performer on inserts, but definitely not the fastest on reads.
>
>
>
> I'm suspecting the read path is relying heavily on the fact that you want to
> get many columns that are closely related, because lookup by key appears to
> be incredibly slow.
>
>
>
> From: yangfeng [mailto:yea...@gmail.com]
> Sent: Tuesday, April 20, 2010 7:59 AM
> To: user@cassandra.apache.org; d...@cassandra.apache.org
> Subject: How to increase cassandra's performance in read?
>
>
>
> I get 10 column families by key, and one column family has 30 columns.
>
> I use multigetSlice once to get the 10 column families, but the performance
> is very poor.
>
> Does anyone have other ideas for improving the performance?
>
>
