Well, it appears that this just isn't possible. I created CASSANDRA-5959 as a result (backstory and performance testing results are described in the issue):
https://issues.apache.org/jira/browse/CASSANDRA-5959
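In the meantime, one partial workaround is to group the per-column INSERTs for a shard into a single unlogged CQL3 batch, so they at least travel to the coordinator as one request; since every entry in a shard shares the same partition key (row_id, shard_num), the batch should also land on one replica set. A rough sketch against the query_results table quoted below, with placeholder values:

    BEGIN UNLOGGED BATCH
      INSERT INTO query_results (row_id, shard_num, list_index, result)
        VALUES ('some-row-id', 0, 0, 'result text 0');
      INSERT INTO query_results (row_id, shard_num, list_index, result)
        VALUES ('some-row-id', 0, 1, 'result text 1');
      -- ... one INSERT per list_index in the shard ...
    APPLY BATCH;

This is still one statement per column on the server side; it only saves round trips, which is why a true multi-column insert is worth asking for.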
--
Les Hazlewood | @lhazlewood
CTO, Stormpath | http://stormpath.com | @goStormpath | 888.391.5282

On Thu, Aug 29, 2013 at 12:04 PM, Les Hazlewood <lhazlew...@apache.org> wrote:

> Hi all,
>
> We're using a Cassandra table to store search results in a
> table/column family that looks like this:
>
> +--------+---------+---------+---------+----
> |        | 0       | 1       | 2       | ...
> +--------+---------+---------+---------+----
> | row_id | text... | text... | text... | ...
>
> The column name is the index # (an integer) of the location in the
> overall result set. The value is the result at that particular index.
> This is great because pagination becomes a simple slice query on the
> column name.
>
> Large result sets are split into multiple rows - we're limiting row
> size on disk to be around 6 or 7 MB. For our particular result
> entries, this means we can get around 50,000 columns in a single row.
>
> When we create the rows, we have the entire data available in the
> application at the time the row insert is necessary.
>
> Using CQL3, an initial implementation had one INSERT statement per
> column. This was killing performance (not to mention the # of
> tombstones it created).
>
> Here's the CQL3 table definition:
>
> create table query_results (
>     row_id text,
>     shard_num int,
>     list_index int,
>     result text,
>     primary key ((row_id, shard_num), list_index)
> ) with compact storage;
>
> (The row key is row_id + shard_num; the clustering column is list_index.)
>
> I don't want to execute 50,000 INSERT statements for a single row. We
> have all of the data up front - I want to execute a single INSERT.
>
> Is this possible?
>
> We're using the Datastax Java Driver.
>
> Thanks for any help!
>
> Les
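For completeness, the "pagination becomes a simple slice query" pattern described above maps to a range restriction on the clustering column. A minimal sketch, assuming the query_results definition quoted above and placeholder values for the keys and page bounds:

    SELECT list_index, result
    FROM query_results
    WHERE row_id = 'some-row-id'   -- placeholder partition key values
      AND shard_num = 0
      AND list_index >= 100        -- page start (inclusive)
      AND list_index < 150;        -- page end (exclusive), i.e. a 50-entry page

Within a shard the results come back ordered by list_index, so each page is just the next list_index range.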