Hi everyone,

We're trying to implement a virtual datastore for our users, where they can set up "tables" and "indexes" to store objects and have them indexed on arbitrary properties. We did a test implementation on top of Cassandra in the following way:
Objects are stored in one column family. Each key is tableid + object key, and each row has a single column whose value is the serialized object. This part is super simple; we're just using Cassandra as a key-value store, and it performs really well.

The indexes are a bit trickier. For each index and each object that is stored, we compute a fixed-length byte array from the object; this is the indexvalue. We store these indexvalues in another column family, with the indexid as the row key, the indexvalue as the column name, and the object key as the column value. To perform a range query on an "index" in this virtual datastore, we then do a get_slice to fetch the range of indexvalues and their corresponding object keys, and multiget the actual objects from the other column family.

Since these virtual tables and indexes will be created by our users, the whole system has to be very dynamic, and we can't make any assumptions about the actual objects that will be stored or their distribution. We do know that it must scale well, which is what attracted us to Cassandra in the first place. We also have some performance targets we want to hit: one use case will have about 30 million records in a "table", and knowing that it can go up to 100 million would be nice. As for speed, we would like thousands of writes and range reads per second. Given these requirements and our design, we end up with rows in Cassandra that have millions of columns, from which we want to fetch large column slices.

We set it all up on a single developer machine (Mac Pro, quad-core 2.66 GHz) running Windows, and used the Thrift compiler to generate a C# client library. We tested just the "index" part of our design, and these are the numbers we got:

inserts (15 threads, batches of 10): 4000/second
get_slice (10 threads, random range sizes, count 1000): 50/second at the start; dies with an OutOfMemoryException at about 6 million columns inserted
get_slice (10 threads, random range sizes, count 10): 200/second at the start, slowing down as the number of columns grows

Since those results were bad, we tried a different approach: storing the indexvalues in the row key instead, using the OrderPreservingPartitioner, and calling get_range_slice to fetch ranges of rows. The results were even worse:

inserts (15 threads, batches of 10): 4000/second
get_range_slice (10 threads, random key ranges, count 1000): 20/second at the start, 5/second with 30 million rows

Finally, we ran a similar test against MySQL and got these numbers:

inserts (15 threads, batches of 10): 4000/second
select (10 threads, LIMIT 1000): 500/second

So for us the MySQL version delivers the speed we want, but none of the scaling that Cassandra gives us.

We set up our column families like this:

<ColumnFamily CompareWith="BytesType" Name="Objects" RowsCached="0" KeysCached="0"/>
<ColumnFamily CompareWith="BytesType" Name="Indexes" RowsCached="0" KeysCached="0"/>

And we now have these questions:

a) Is there a better way of structuring our data and building the virtual indexes?
b) Are our Cassandra numbers too low, or is this the expected performance?
c) Did we forget to change some important setting (in the conf XML or the Java config), given that our rows are this large?
d) Can we avoid hitting the OutOfMemoryException?
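In case it helps to diagnose this, here are simplified sketches of the relevant pieces of our implementation (simplified from the real code; "Keyspace1" stands in for our actual keyspace, and the Thrift-generated C# names may differ slightly between Thrift versions). First, the index-value encoding. Since the comparator is BytesType, which compares column names as unsigned bytes, numeric properties are encoded big-endian with the sign bit flipped, so that byte order matches numeric order:

    // Sketch: fixed-length, order-preserving encoding of a numeric property.
    // BytesType compares column names as unsigned bytes, so a signed long is
    // encoded big-endian with its sign bit flipped.
    static byte[] EncodeLong(long value)
    {
        ulong biased = (ulong)value ^ 0x8000000000000000UL; // flip sign bit
        var bytes = new byte[8];
        for (int i = 7; i >= 0; i--) // big-endian: most significant byte first
        {
            bytes[i] = (byte)(biased & 0xFF);
            biased >>= 8;
        }
        return bytes;
    }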
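The write path then does one insert into Indexes per object per index (batched in the real code), roughly:

    using System;
    using System.Text;
    using Apache.Cassandra; // Thrift-generated client

    // Sketch: column name = indexvalue, column value = object key.
    static void WriteIndexEntry(Cassandra.Client client, string indexId,
                                byte[] indexValue, string objectKey)
    {
        var path = new ColumnPath
        {
            Column_family = "Indexes",
            Column = indexValue
        };
        client.insert("Keyspace1", indexId, path,
                      Encoding.UTF8.GetBytes(objectKey),
                      DateTime.UtcNow.Ticks, // timestamp (just needs to increase)
                      ConsistencyLevel.ONE);
    }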
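And the range-read path is a get_slice over the index row, followed by a multiget_slice against Objects:

    using System.Collections.Generic;
    using System.Text;
    using Apache.Cassandra; // Thrift-generated client

    // Sketch: fetch object keys for an indexvalue range, then the objects.
    static List<string> RangeQuery(Cassandra.Client client, string indexId,
                                   byte[] from, byte[] to, int count)
    {
        var predicate = new SlicePredicate
        {
            Slice_range = new SliceRange
                { Start = from, Finish = to, Reversed = false, Count = count }
        };
        var parent = new ColumnParent { Column_family = "Indexes" };
        var slice = client.get_slice("Keyspace1", indexId, parent,
                                     predicate, ConsistencyLevel.ONE);

        var objectKeys = new List<string>();
        foreach (var cosc in slice)
            objectKeys.Add(Encoding.UTF8.GetString(cosc.Column.Value));
        // ...then multiget_slice against "Objects" with objectKeys...
        return objectKeys;
    }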
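For the second approach (OrderPreservingPartitioner), the indexvalue moves into the row key. Since row keys are strings in this API, we hex-encode the indexvalue bytes to keep the ordering (OPP sorts keys lexically, and fixed-width lowercase hex preserves byte order):

    using System.Text;

    // Sketch: row key construction for the OrderPreservingPartitioner variant.
    static string MakeRowKey(string indexId, byte[] indexValue)
    {
        var sb = new StringBuilder(indexId).Append(':');
        foreach (var b in indexValue)
            sb.Append(b.ToString("x2")); // fixed-width hex keeps byte order
        return sb.ToString();
    }

/Henrik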