As a sanity check, I ran the same Python script with the same row IDs again today, and it was 10x faster. Something must be going wrong intermittently in my cluster.

-Allan

On April 11, 2014 at 11:02:11 AM, Allan C (alla...@gmail.com) wrote:

It’s a fairly standard relational-like CF. Description is the only field that’s potentially big (can be up to 1k).

CREATE COLUMN FAMILY 'Event' WITH
  key_validation_class = 'UTF8Type' AND
  comparator = 'UTF8Type' AND
  default_validation_class = 'UTF8Type' AND
  bloom_filter_fp_chance = 0.1 AND
  compaction_strategy = 'LeveledCompactionStrategy' AND
  compaction_strategy_options = {sstable_size_in_mb:160} AND
  compression_options = {sstable_compression:SnappyCompressor,chunk_length_kb:64} AND
  -- key_alias = 'eventId' AND
  column_metadata = [
    {column_name: 'createdAt', validation_class: 'DateType'},
    {column_name: 'creatorId', validation_class: 'UTF8Type'},
    {column_name: 'creatorName', validation_class: 'UTF8Type'},
    {column_name: 'description', validation_class: 'UTF8Type'},
    {column_name: 'privacy', validation_class: 'UTF8Type'},
    {column_name: 'location', validation_class: 'UTF8Type'},
    {column_name: 'locationId', validation_class: 'UTF8Type'},
    {column_name: 'endTime', validation_class: 'DateType'},
    {column_name: 'name', validation_class: 'UTF8Type'},
    {column_name: 'picture', validation_class: 'UTF8Type'},
    {column_name: 'startTime', validation_class: 'DateType'},
    {column_name: 'updatedAt', validation_class: 'DateType'},
    {column_name: 'lat', validation_class: 'UTF8Type'},
    {column_name: 'lng', validation_class: 'UTF8Type'},
    {column_name: 'street', validation_class: 'UTF8Type'},
    {column_name: 'city', validation_class: 'UTF8Type'},
    {column_name: 'state', validation_class: 'UTF8Type'},
    {column_name: 'zip', validation_class: 'UTF8Type'},
    {column_name: 'country', validation_class: 'UTF8Type'},
    {column_name: '~lastSync', validation_class: 'DateType'},
    {column_name: '~nextSync', validation_class: 'DateType'},
    {column_name: '~syncBlock', validation_class: 'IntegerType'},
    {column_name: 'noCount', validation_class: 'IntegerType'},
    {column_name: 'invitedCount', validation_class: 'IntegerType'},
    {column_name: 'maybeCount', validation_class: 'IntegerType'},
    {column_name: 'yesCount', validation_class: 'IntegerType'},
    {column_name: '~version', validation_class: 'IntegerType'}
  ];

-Allan

On April 10, 2014 at 4:49:34 PM, Tyler Hobbs (ty...@datastax.com) wrote:

On Thu, Apr 10, 2014 at 6:26 PM, Allan C <alla...@gmail.com> wrote:

Looks like the amount of data returned has a big effect. When I only return one column, python reports only 20ms compared to 150ms when returning the whole row. Rows are each less than 1k in size, but there must be client overhead.

That's a surprising amount of overhead in pycassa. What's your schema like for this CF?

--
Tyler Hobbs
DataStax
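
For anyone trying to reproduce the one-column versus whole-row comparison discussed in this thread, it may help to time each kind of read several times and compare medians, so that a single slow run on the cluster does not skew the result. The following is only a minimal sketch: `time_reads` is a hypothetical helper, not part of pycassa, and the keyspace name, host, and `row_ids` in the commented usage are placeholders that do not come from the thread.

```python
import time
from statistics import median


def time_reads(fetch, keys, repeats=5):
    """Return the median wall-clock time, in milliseconds, of one fetch(key) call.

    fetch   -- any callable taking a row key (hypothetical; supply your own)
    keys    -- the row keys to read
    repeats -- how many passes to make over the keys
    """
    samples = []
    for _ in range(repeats):
        for key in keys:
            start = time.perf_counter()
            fetch(key)
            samples.append((time.perf_counter() - start) * 1000.0)
    return median(samples)


# Possible usage with pycassa against the 'Event' CF. This needs a live
# cluster; 'MyKeyspace', 'localhost:9160' and row_ids are placeholders.
#
#   from pycassa.pool import ConnectionPool
#   from pycassa.columnfamily import ColumnFamily
#
#   pool = ConnectionPool('MyKeyspace', ['localhost:9160'])
#   events = ColumnFamily(pool, 'Event')
#   whole_row_ms = time_reads(lambda k: events.get(k), row_ids)
#   one_column_ms = time_reads(lambda k: events.get(k, columns=['name']), row_ids)
```

Comparing medians from runs on different days would also help show whether the slowdown tracks the amount of data returned or comes and goes with the state of the cluster.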