Your reverse index of "which rows contain a column named X" will have very wide rows. You could look at Cassandra's secondary indexing, or possibly at a Solandra/Solr approach. Another option is to shift the problem slightly: "which rows have a column X that was added between time y and time z". Remember that with few distinct column names, a reverse index from column name to rows is going to be a very big list for each name.
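
To make the trade-off concrete, here is a rough sketch in plain Python. It uses in-memory dicts as stand-ins for column families and makes no real Cassandra client calls. The names `main_cf`, `column_names_cf`, `reverse_index_cf` and the one-hour `BUCKET_SECONDS` are assumptions made up for illustration. It shows a small registry of distinct column names next to a reverse index that is split by time bucket so no single index row grows without bound.

```python
from collections import defaultdict

# In-memory stand-ins for column families (hypothetical names, not a Cassandra API).
main_cf = defaultdict(dict)           # row key -> {column name: value}
column_names_cf = {}                  # column name -> None (small registry of names)
reverse_index_cf = defaultdict(dict)  # (column name, time bucket) -> {row key: timestamp}

BUCKET_SECONDS = 3600  # assumed bucket width; tune to the write rate


def insert(row_key, column, value, ts):
    """Write to the main CF and keep both helper structures up to date."""
    main_cf[row_key][column] = value
    # Idempotent overwrite: rewriting an existing name changes nothing, so no locking.
    column_names_cf[column] = None
    # Bucketing by time keeps each index row bounded instead of one huge row per name.
    reverse_index_cf[(column, ts // BUCKET_SECONDS)][row_key] = ts


def all_column_names():
    """Return every distinct column name ever inserted."""
    return sorted(column_names_cf)


def rows_with_column(column, start_ts, end_ts):
    """Return rows where `column` was written between start_ts and end_ts inclusive."""
    rows = set()
    first = start_ts // BUCKET_SECONDS
    last = end_ts // BUCKET_SECONDS
    for bucket in range(first, last + 1):
        for row_key, ts in reverse_index_cf.get((column, bucket), {}).items():
            if start_ts <= ts <= end_ts:
                rows.add(row_key)
    return rows
```

The registry stays as small as the number of distinct names, while the reverse index costs one extra write per insert and grows with the number of rows. That growth is why bounding each lookup to a time range helps.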
On Thu, Apr 4, 2013 at 5:45 PM, Drew Kutcharian <d...@venarc.com> wrote:
> Hi Edward,
>
> I anticipate that the column names will be reused a lot. For example, key1
> will be in many rows. So I think the number of distinct column names will
> be much much smaller than the number of rows. Is there a way to have a
> separate CF that keeps track of the column names?
>
> What I was thinking was to have a separate CF that I write only the column
> name with a null value in there every time I write a key/value to the main
> CF. In this case if that column name exist, then it will just be
> overridden. Now if I wanted to get all the column names, then I can just
> query that CF. Not sure if that's the best approach at high load (100k
> inserts a second).
>
> -- Drew
>
>
> On Apr 4, 2013, at 12:02 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
>
> You can not get only the column name (which you are calling a key) you can
> use get_range_slice which returns all the columns. When you specify an
> empty byte array (new byte[0]{}) as the start and finish you get back all
> the columns. From there you can return only the columns to the user in a
> format that you like.
>
>
> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian <d...@venarc.com> wrote:
>
>> Hey Guys,
>>
>> I'm working on a project and one of the requirements is to have a schema
>> free CF where end users can insert arbitrary key/value pairs per row. What
>> would be the best way to know what are all the "keys" that were inserted
>> (preferably w/o any locking). For example,
>>
>> Row1 => key1 -> XXX, key2 -> XXX
>> Row2 => key1 -> XXX, key3 -> XXX
>> Row3 => key4 -> XXX, key5 -> XXX
>> Row4 => key2 -> XXX, key5 -> XXX
>> …
>>
>> The query would be give me all the inserted keys and the response would
>> be {key1, key2, key3, key4, key5}
>>
>> Thanks,
>>
>> Drew