On Wed, Jan 18, 2012 at 2:44 AM, Josep Blanquer <blanq...@rightscale.com> wrote:
> Hi,
>
>  I've been doing some tests using wide rows recently, and I've seen some odd
> performance problems that I'd like to understand.
>
> In particular, I've seen that the time it takes for Cassandra to perform a
> column slice of a single key, solely in a Memtable, seems to be very
> expensive, but most importantly proportional to the ordered position where
> the start column of the slice lives.
>
> In other words:
>  1- if I start Cassandra fresh (with an empty ColumnFamily with TimeUUID
> comparator)
>  2- I create a single Row with Key "K"
>  3- Then add 200K TimeUUID columns to key "K"
>  4- (and make sure nothing is flushed to SSTables...so it's all in the
> Memtable)
>
> ...I observe the following timings (seconds to perform 1000 reads) while
> performing multiget slices on it:  (pardon the pseudo-code, but you'll get
> the gist)
>
> a) simply a get of the first column:  GET("K",:count=>1)
>   --  2.351226
>
> b) doing a slice get, starting from the first column:  GET("K",:start =>
> '144abe16-416c-11e1-9e23-2cbae9ddfe8b' , :count => 1 )
>   -- 2.189224   <<- so with or without "start" doesn't seem to make much of
> a difference
>
> c) doing a slice get, starting from the middle of the ordered
> columns..approx starting at item number 100K:   GET("K",:start =>
> '9c13c644-416c-11e1-81dd-4ba530dc83d0' , :count => 1 )
>  -- 11.849326  <<- 5 times more expensive if the start of the slice is 100K
> positions away
>
> d) doing a slice get, starting from the last of the ordered columns..approx
> position 200K:   GET("K",:start => '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' ,
> :count => 1 )
>   -- 19.889741   <<- Almost twice as expensive as starting the slice at
> position 100K, and about 9 times more expensive than starting from the first one
>
> This behavior leads me to believe that the Memtable does a linear scan over
> the columns of the key to find the start of the slice.
> If one tries a column name read on those positions (i.e., not a slice), the
> performance is constant. I.e., GET("K",
> '144abe16-416c-11e1-9e23-2cbae9ddfe8b') . Retrieving the first, middle or
> last timeUUID is done in the same amount of time.
>
> Having increasingly worse performance for column slices in Memtables seems
> to be a bit of a problem...aren't Memtables backed by a structure that has
> some sort of column name indexing?...so that landing on the start column can
> be efficient? I'm definitely observing very high CPU utilization on those
> scans...By the way, with wide rows like this, slicing SSTables is
> considerably faster than slicing Memtables...I'm attributing that to the
> sampled index of the SSTables, which is why I'm wondering whether Memtables
> lack such built-in column indexing and resort to linked lists of sorts....
>
> Note that the actual timings shown are not important; this is on my laptop
> and I have a small amount of debugging enabled...what is important is the
> difference between them.
>
> I'm using Cassandra trunk as of Dec 1st, but I believe I've done experiments
> with 0.8 series too, leading to the same issue.

You may want to retry your experiments on current trunk. We did have an
inefficiency in our memtable search that was fixed by:
https://issues.apache.org/jira/browse/CASSANDRA-3545
(the name of the ticket doesn't make it clear that it's related but it is)

The fix was committed on December 8.
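
To illustrate the general pattern your numbers suggest, here is a rough
Python sketch. It is not Cassandra's code, and it makes no claim about what
the ticket actually changed. The function names are made up for illustration,
and plain integers stand in for the sorted TimeUUID column names. It only
contrasts walking a sorted column list from the front with seeking directly
to the slice start.

```python
import bisect
import timeit


def slice_linear(columns, start, count):
    """Walk the sorted columns from the front until `start` is reached.

    Cost grows with the position of `start` in the row.
    """
    out = []
    for name in columns:
        if name >= start:
            out.append(name)
            if len(out) == count:
                break
    return out


def slice_seek(columns, start, count):
    """Binary-search to `start`, then read `count` columns.

    Cost is O(log n + count), independent of where `start` sits.
    """
    i = bisect.bisect_left(columns, start)
    return columns[i:i + count]


# Hypothetical stand-in for one row holding 200K sorted column names.
columns = list(range(200_000))

if __name__ == "__main__":
    for start in (0, 100_000, 199_999):
        t_lin = timeit.timeit(lambda: slice_linear(columns, start, 1), number=20)
        t_seek = timeit.timeit(lambda: slice_seek(columns, start, 1), number=20)
        print(f"start={start:>7}  linear={t_lin:.4f}s  seek={t_seek:.6f}s")
```

With the linear version, the time should grow with the start position, much
like your cases (b), (c) and (d); with the seek version it should stay
roughly flat.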

--
Sylvain

>
>  Cheers,
>
> Josep M.
