Hi,

 I've been doing some tests using wide rows recently, and I've seen some
odd performance problems that I'd like to understand.

In particular, I've seen that a column slice of a single key, served
solely from a Memtable, seems to be very expensive and, more importantly,
takes time proportional to the ordered position of the slice's start
column.

In other words:
 1- if I start Cassandra fresh (with an empty ColumnFamily with TimeUUID
comparator)
 2- I create a single Row with Key "K"
 3- Then add 200K TimeUUID columns to key "K"
 4- (and make sure nothing is flushed to SSTables...so it's all in the
Memtable)

...I observe the following timings (seconds to perform 1000 reads) while
performing multiget slices on it:  (pardon the pseudo-code, but you'll get
the gist)

a) simply a get of the first column:  GET("K",:count=>1)
  --  2.351226

b) doing a slice get, starting from the first column:  GET("K",:start =>
'144abe16-416c-11e1-9e23-2cbae9ddfe8b' , :count => 1 )
  -- 2.189224   <<- so with or without "start" doesn't seem to make much of
a difference

c) doing a slice get, starting from the middle of the ordered
columns..approx starting at item number 100K:   GET("K",:start =>
'9c13c644-416c-11e1-81dd-4ba530dc83d0' , :count => 1 )
 -- 11.849326  <<- 5 times more expensive if the start of the slice is 100K
positions away

d) doing a slice get, starting from the last of the ordered columns..approx
position 200K:   GET("K",:start => '1c1b9b32-416d-11e1-83ff-dd2796c3abd7' ,
:count => 1 )
  -- 19.889741   <<- Almost twice as expensive as starting the slice at
position 100K, and roughly 10 times more expensive than starting from the
first one
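
To make the comparison concrete, here is a small sketch that turns the four
timings above into per-read latencies and ratios relative to case (b). The
numbers are copied from the measurements; the dictionary keys and variable
names are only illustrative.

```python
# Seconds per 1000 reads, copied from measurements (a)-(d) above.
timings = {
    "a_first_no_start": 2.351226,
    "b_first_with_start": 2.189224,
    "c_middle_100k": 11.849326,
    "d_last_200k": 19.889741,
}

READS = 1000
baseline = timings["b_first_with_start"]

# Per-read latency in milliseconds, and cost relative to case (b).
per_read_ms = {name: secs / READS * 1000.0 for name, secs in timings.items()}
ratio_to_b = {name: secs / baseline for name, secs in timings.items()}

for name in timings:
    print(f"{name:20s} {per_read_ms[name]:7.3f} ms/read  "
          f"{ratio_to_b[name]:5.2f}x of (b)")
```

That works out to roughly 5.4x for (c) and 9.1x for (d) relative to (b),
i.e. the cost keeps growing with the position of the start column.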

This behavior leads me to believe that the Memtable is doing a linear scan
over the columns of the key to reach the start of the slice.
If one tries a column name read on those positions (i.e., not a slice), the
performance is constant. I.e., GET("K",
'144abe16-416c-11e1-9e23-2cbae9ddfe8b') . Retrieving the first, middle or
last timeUUID is done in the same amount of time.
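
As a toy model of that hypothesis (this is not Cassandra code, and the
"col-NNNNNNNN" names are just sortable stand-ins for TimeUUIDs), the sketch
below counts how many columns a slice has to walk past if it always starts
from the head of the row, and contrasts that with a by-name lookup:

```python
def make_columns(n):
    """Build n (name, value) pairs with sortable stand-in column names."""
    return [(f"col-{i:08d}", i) for i in range(n)]

def slice_by_scan(columns, start, count):
    """Walk the sorted columns from the head until `start` is reached.

    Returns the sliced columns and the number of columns visited.
    """
    steps = 0
    result = []
    for name, value in columns:
        steps += 1
        if name >= start:
            result.append((name, value))
            if len(result) == count:
                break
    return result, steps

columns = make_columns(200_000)
by_name = dict(columns)  # hash lookup: cost does not depend on position

for pos in (0, 100_000, 199_999):
    start = columns[pos][0]
    result, steps = slice_by_scan(columns, start, 1)
    print(f"pos={pos:6d} scan_steps={steps:6d} by_name={by_name[start]}")
```

In this model the scan visits pos + 1 columns before returning, so the slice
cost grows linearly with the start position while the by-name read stays
flat, which is the same shape as the timings above.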

Having increasingly worse performance for column slices in Memtables seems
to be a bit of a problem...aren't Memtables backed by a structure that has
some sort of column name indexing, so that landing on the start column
can be efficient? I'm definitely observing very high CPU utilization on
those scans...By the way, with wide rows like this, slicing SSTables is
quite a bit faster than slicing Memtables...I'm attributing that to the
sampled index of the SSTables, which is why I'm wondering whether Memtables
lack such column indexing and resort to linked lists of some sort....
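
For comparison, this is the kind of behaviour I'd expect from any structure
with ordered column-name indexing. The sketch is a plain binary search over
a sorted list of stand-in names; it says nothing about what the Memtable
actually uses internally, and `seek` / `slice_by_seek` are made-up helper
names.

```python
def seek(names, start):
    """Binary search for the first index whose name is >= start.

    Returns the index and the number of probes (comparisons) performed.
    """
    lo, hi, probes = 0, len(names), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if names[mid] < start:
            lo = mid + 1
        else:
            hi = mid
    return lo, probes

def slice_by_seek(names, start, count):
    """Land on `start` via seek(), then return `count` names from there."""
    idx, probes = seek(names, start)
    return names[idx:idx + count], probes

names = [f"col-{i:08d}" for i in range(200_000)]  # stand-ins for TimeUUIDs

for pos in (0, 100_000, 199_999):
    result, probes = slice_by_seek(names, names[pos], 1)
    print(f"pos={pos:6d} probes={probes:2d} first={result[0]}")
```

With 200K columns the search never needs more than 18 probes regardless of
where the slice starts, so I'd expect the first, middle and last positions
to cost about the same.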

Note that the actual timings shown are not important: this is on my laptop
and I have a small amount of debugging enabled...what is important is
the difference between them.

I'm using Cassandra trunk as of Dec 1st, but I believe I've done
experiments with 0.8 series too, leading to the same issue.

 Cheers,

Josep M.
