We are starting to use Cassandra to power our activity feed. The way we
organize our data is simple. Events live in a CF called Events and are
keyed by a UUID. The timelines themselves live in a CF called Timelines,
which is keyed by user id (e.g. "1229") and contains event UUIDs as column
names (sorted by TimeUUIDType).

To load a feed, we get a slice of the timeline CF for that user, then
multiget all of the corresponding events.
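
Roughly, the read path looks like the sketch below. Plain Python dicts
stand in for the two CFs here, so none of this is a real Cassandra client
call, and the names (add_event, load_feed, count) are made up for
illustration. The list per user is kept in insertion order to mimic the
time-ordered column names.

```python
import uuid

events = {}     # stand-in for the Events CF: event UUID -> event data
timelines = {}  # stand-in for the Timelines CF: user id -> event UUIDs, oldest first


def add_event(user_id, payload):
    """Store an event and record its UUID in the user's timeline."""
    event_id = uuid.uuid1()  # time-based UUID, like a TimeUUIDType column name
    events[event_id] = payload
    timelines.setdefault(user_id, []).append(event_id)
    return event_id


def load_feed(user_id, count=20):
    """Load the newest `count` events for a user in two steps."""
    # Step 1: slice the timeline row (newest column names first).
    event_ids = timelines.get(user_id, [])[-count:][::-1]
    # Step 2: multiget the corresponding event rows.
    rows = {eid: events[eid] for eid in event_ids if eid in events}
    # Return the events in timeline order, skipping any missing rows.
    return [rows[eid] for eid in event_ids if eid in rows]
```

Step 1 touches a single row, while step 2 touches one row per event, which
is where the time goes for us.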

Loading the slice of the timeline is reasonably fast at 4-6ms, but
multigetting the events is terribly slow, on the order of 35-100ms.

To alleviate the problem, we write events through to memcached and use a
memcached multiget in front of the Cassandra multiget. We have enough cache
space to get upwards of a 99% hit rate, which makes loading the events
extremely fast, but it would be nice to make use of the 24GB of memory in
our Cassandra nodes.
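
The cache-in-front-of-multiget idea is roughly the following. This is a
minimal sketch, not our actual code: a dict stands in for memcached,
backing_multiget is a hypothetical callable standing in for the Cassandra
multiget, and filling the cache on a miss is an assumption added for the
example (our setup writes through at event creation time).

```python
def cached_multiget(keys, cache, backing_multiget):
    """Fetch `keys` from the cache first, then the backing store for misses."""
    # Cache multiget: take everything that is already cached.
    found = {k: cache[k] for k in keys if k in cache}
    missing = [k for k in keys if k not in found]
    if missing:
        # Only the misses go to the slower backing store.
        fetched = backing_multiget(missing)
        # Assumption: populate the cache so the next read is a hit.
        cache.update(fetched)
        found.update(fetched)
    return found
```

With a hit rate near 99%, the backing multiget is rarely called at all,
and when it is, it is asked for only a handful of keys.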

We're on 0.6, and I've enabled the row cache. It seems to have data in it,
but it's still slow.

So, am I doing something wrong, or is this the expected perf?

- James
