Not sure why the first paragraph turned into a numbered bullet...

On Sun, May 2, 2010 at 11:00 AM, James Golick <jamesgol...@gmail.com> wrote:
> 1. I wrote to the list a while back about less-than-great performance when
> reading thousands of columns, even on cache hits. Last night, I decided to
> try to get to the bottom of why.
>
> I tested this by setting the row cache capacity on a TimeUUIDType-sorted CF
> to 10, filling up a single row with 2000 columns, and only running queries
> against that row. That row was the only thing in the database. I rm -Rf'd
> the data before starting the test.
>
> The tests were done from Coda Hale's Scala client cassie, which is just a
> thin layer around the Java Thrift bindings. I didn't actually time each call
> because that wasn't the objective, but I didn't really need to. Reads of 10
> columns felt quick enough, but 100 columns was noticeably slower. 1000
> columns would frequently cause the client to time out. The cache hit rate on
> that CF was 1.0, so, yes, the row was in cache.
>
> Doing a thousand reads with count=100 in a single thread pegged my
> MacBook's CPU and caused the fans to spin up pretty loud.
>
> So, I attached a profiler and repeated the test. I'm no expert on Cassandra
> internals, so please let me know if I'm way off here. The profiled reads
> were reversed=true, count=100.
>
> As far as I can tell, there are three components taking up most of the time
> on this type of read (row slice out of cache):
>
>    1. ColumnFamilyStore.removeDeleted() @ ~40% - Most of the time in here
>    is actually spent materializing UUID objects so that they can be
>    compared in the ConcurrentSkipListMap (ColumnFamily.columns_).
>    2. SliceQueryFilter.getMemColumnIterator @ ~30% - Virtually all the
>    time in here is spent in ConcurrentSkipListMap$Values.toArray().
>    3. QueryFilter.collectCollatedColumns @ ~30% - All the time is spent
>    in ColumnFamily.addColumn, and about half of that total is spent
>    materializing UUIDs for comparison.
>
> This profile is consistent with the decrease in performance with higher
> values for count. If there are more UUIDs to deserialize, the time spent in
> removeDeleted() and collectCollatedColumns() should increase (roughly)
> linearly.
>
> So, my question at this point is how to fix it. I have some basic ideas,
> but being new to Cassandra internals, I'm not sure they make any sense. Help
> me out here:
>
>    1. Optionally call removeDeleted() less often. I realize that this is
>    probably a bad idea for a lot of reasons, but it was the first thing I
>    thought of.
>    2. When a ColumnFamily object is put into the row cache, copy the
>    columns over to another data structure that doesn't need to be sorted
>    on get(). If columns_ needs to be kept around, this option would have a
>    memory impact, but at least for us, it'd be well worth it for the speed.
>    3. ????
>
> I'd love to hear feedback on these / the rest of this (long) post.
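
To make option 2 above a little more concrete, here is a minimal Java sketch. It is not Cassandra code: CachedRowSketch, Snapshot, timeUuidName, and the counter are hypothetical names made up for illustration, values are plain Strings instead of columns, and the comparator only approximates TimeUUIDType by decoding a java.util.UUID on every comparison. The idea is that a row is copied once into a plain array when it is cached, so a reversed slice walks that array instead of re-sorting or re-comparing through the skip list.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

class CachedRowSketch {
    // Instrumentation only: counts UUID objects created by the comparator.
    static final AtomicLong uuidsMaterialized = new AtomicLong();

    // Hypothetical helper: a 16-byte version 1 UUID column name whose
    // timestamp field holds `timestamp` (clock sequence and node are zero).
    static byte[] timeUuidName(long timestamp) {
        long msb = (timestamp & 0xFFFFFFFFL) << 32       // time_low
                 | ((timestamp >>> 32) & 0xFFFFL) << 16  // time_mid
                 | 0x1000L                               // version 1
                 | ((timestamp >>> 48) & 0x0FFFL);       // time_hi
        long lsb = 0x8000000000000000L;                  // IETF variant bits
        return ByteBuffer.allocate(16).putLong(msb).putLong(lsb).array();
    }

    // Decodes a column name into a UUID object; this is the per-comparison cost.
    static UUID toUuid(byte[] name) {
        uuidsMaterialized.incrementAndGet();
        ByteBuffer buf = ByteBuffer.wrap(name);
        return new UUID(buf.getLong(), buf.getLong());
    }

    // Assumption: a stand-in for a time-UUID comparator, ordering by timestamp only.
    static final Comparator<byte[]> TIME_UUID_ORDER =
        (a, b) -> Long.compare(toUuid(a).timestamp(), toUuid(b).timestamp());

    // Builds one "row" of columnCount columns, sorted by the comparator above.
    static ConcurrentSkipListMap<byte[], String> buildRow(int columnCount) {
        ConcurrentSkipListMap<byte[], String> columns =
            new ConcurrentSkipListMap<>(TIME_UUID_ORDER);
        for (int i = 0; i < columnCount; i++) {
            columns.put(timeUuidName(i), "v" + i);
        }
        return columns;
    }

    // Reversed slice read straight from the sorted map.
    static List<String> liveReversedSlice(ConcurrentSkipListMap<byte[], String> columns,
                                          int count) {
        List<String> out = new ArrayList<>(count);
        for (String value : columns.descendingMap().values()) {
            if (out.size() == count) break;
            out.add(value);
        }
        return out;
    }

    // Read-only copy taken once, at the moment the row would enter the cache.
    static final class Snapshot {
        private final String[] values; // ascending column order

        Snapshot(ConcurrentSkipListMap<byte[], String> columns) {
            values = columns.values().toArray(new String[0]);
        }

        // Walks the array backwards; no comparator calls, no UUID decoding.
        List<String> reversedSlice(int count) {
            List<String> out = new ArrayList<>(count);
            for (int i = values.length - 1; i >= 0 && out.size() < count; i--) {
                out.add(values[i]);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<byte[], String> columns = buildRow(2000);
        Snapshot snapshot = new Snapshot(columns);

        uuidsMaterialized.set(0);
        List<String> live = liveReversedSlice(columns, 100);
        long liveCost = uuidsMaterialized.getAndSet(0);
        List<String> cached = snapshot.reversedSlice(100);
        long cachedCost = uuidsMaterialized.get();

        System.out.println("same result: " + live.equals(cached));
        System.out.println("UUIDs materialized, live map: " + liveCost);
        System.out.println("UUIDs materialized, snapshot: " + cachedCost);
    }
}
```

The trade-off is the one already mentioned in the quoted post: the snapshot holds a second set of references alongside columns_, and it would have to be rebuilt or dropped whenever the cached row changes. The sketch also says nothing about removeDeleted(), which would still need its own answer.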