It's not really possible to give a general answer to your second question; it depends on your implementation. Personally I do one of two things. The first is to store the array under one row key, with one column per element: the column name is the key of the array entry and the column value holds the data. For some applications, however, since I am using Java, I just serialize my ArrayList (or List) and push the whole content into one column. It all depends on what you want to achieve.
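To make the two options a bit more concrete, here is a rough sketch in plain Java. It does not call Hector or Cassandra at all; it only shows the shape of the data you would end up writing. The class, the Category type and the method names are all made up for illustration, and a TreeMap is just a stand-in for a row whose columns are kept sorted by name.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Hector or Cassandra.
final class CategoryStorageSketch {

    // Hypothetical value type: a category id plus its description.
    static final class Category implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final String description;

        Category(String id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    // Option 1: one column per element.
    // Column name = category id, column value = category description.
    static Map<String, String> toColumns(List<Category> categories) {
        Map<String, String> columns = new TreeMap<>();
        for (Category c : categories) {
            columns.put(c.id, c.description);
        }
        return columns;
    }

    // Option 2: serialize the whole list into the value of a single column.
    static byte[] toSingleColumnValue(ArrayList<Category> categories) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(categories);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reverse of option 2: rebuild the list from the stored bytes.
    @SuppressWarnings("unchecked")
    static ArrayList<Category> fromSingleColumnValue(byte[] value) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(value))) {
            return (ArrayList<Category>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With option 1 you can read or update a single category without touching the others. With option 2 the list is one opaque value: any change means rewriting the whole column, and only Java code that has the same classes can read it back.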
Third question: try to design your CFs according to what you want to achieve. I am designing an internal messaging system and I use only two column families to hold the message lists, message and message box. I would have used one, but I need one that is sorted by TimeUUID and the other by UTF8Type.

I think there is a general consensus here: try to avoid super columns. Two sets of columns can do the same job as one SuperColumn, and that is the preferred scheme.

Again, if you are beginning with Cassandra, just experiment and be ready to change your organization; this is the best way to figure out what to do for your data organization.

Victor Kabdebon
http://www.voxnucleus.fr
http://www.victorkabdebon.net

2011/5/24 Jian Fang <jian.fang.subscr...@gmail.com>

> Does anyone have a good suggestion on my second question? I believe that
> question is a pretty common one.
>
> My third question is a design question. For the same data, we can store
> them into multiple column families or a single column family with multiple
> super columns.
> From Cassandra read/write performance point of view, what are the general
> rules to make multiple column families and when to use a single column
> family?
>
> Thanks again,
>
> John
>
>
> On Mon, May 23, 2011 at 5:47 PM, Jian Fang
> <jian.fang.subscr...@gmail.com>wrote:
>
>> Hi,
>>
>> I am pretty new to Cassandra and am going to use Cassandra 0.8.0. I have
>> two questions (sorry if they are very basic ones):
>>
>> 1) I have a column family to hold many super columns, say 30. When I first
>> insert the data to the column family, do I need to insert each column one at
>> a time or can I insert the whole column family in one transaction (or
>> call?)? The latter one seems to be more efficient to me. Does Cassandra
>> support that?
>>
>> For example, I saw the following code to do insertion (with Hector),
>>
>>     Mutator m = HFactory.createMutator(keyspace, stringSerializer);
>>     //Mutator<String> m = HFactory.createMutator(keyspace,stringSerializer);
>>     m.insert(p.getCassandraKey(), colFamily,
>>              HFactory.createStringColumn("type", p.getStringValue()));
>>     m.insert(p.getCassandraKey(), colFamily,
>>              HFactory.createColumn("data", p.getCompressedXML(),
>>                                    StringSerializer.get(), BytesArraySerializer.get()));
>>
>> Will the insertions be two separate calls to Cassandra? Or they are just
>> one transaction? If it is the former case, is there any way to make them as
>> one call to Cassandra?
>>
>> 2) How to store a list/array of data in Cassandra? For example, I have a
>> data field called categories, which include none or many categories and each
>> category includes a category id and a category description. Usually, how do
>> people handle this scenario when they use Cassandra?
>>
>> Thanks in advance,
>>
>> John
>>
>
>