Thanks a lot. This is really helpful.

John

On Tue, May 24, 2011 at 1:34 PM, Victor Kabdebon <victor.kabde...@gmail.com> wrote:

> It's not really possible to give a general answer to your second question;
> it depends on your implementation. Personally I do two things. The first
> is to map the array to a key, with the name of each column as a key of
> your array and the value of the column as the data storage. However, for
> some applications, as I am using Java, I just serialize my ArrayList (or
> List) and push all the content to one column. It all depends on what you
> want to achieve.
>
> Third question: try to design your CFs according to what you want to
> achieve. I am designing an internal messaging system, and I use only two
> column families to hold the message lists, message and message box. I
> would have used one, but I need one that is sorted by TimeUUID and the
> other by UTF8Type. I think there is a general consensus here: try to
> avoid super columns. Two sets of columns can do the same job as one
> SuperColumn, and that is the preferred scheme.
>
> Again, just experiment and be ready to change your organization if you
> are beginning with Cassandra; this is the best way to figure out what to
> do for your data organization.
>
> Victor Kabdebon
> http://www.voxnucleus.fr
> http://www.victorkabdebon.net
>
> 2011/5/24 Jian Fang <jian.fang.subscr...@gmail.com>
>
>> Does anyone have a good suggestion on my second question? I believe that
>> question is a pretty common one.
>>
>> My third question is a design question. The same data can be stored in
>> multiple column families or in a single column family with multiple
>> super columns. From the point of view of Cassandra read/write
>> performance, what are the general rules for when to use multiple column
>> families and when to use a single column family?
>>
>> Thanks again,
>>
>> John
>>
>> On Mon, May 23, 2011 at 5:47 PM, Jian Fang <jian.fang.subscr...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I am pretty new to Cassandra and am going to use Cassandra 0.8.0.
>>> I have two questions (sorry if they are very basic ones):
>>>
>>> 1) I have a column family that holds many super columns, say 30. When I
>>> first insert the data into the column family, do I need to insert each
>>> column one at a time, or can I insert the whole column family in one
>>> transaction (or call)? The latter seems more efficient to me. Does
>>> Cassandra support that?
>>>
>>> For example, I saw the following code to do insertion (with Hector):
>>>
>>> Mutator m = HFactory.createMutator(keyspace, stringSerializer);
>>> //Mutator<String> m = HFactory.createMutator(keyspace,stringSerializer);
>>> m.insert(p.getCassandraKey(), colFamily,
>>>     HFactory.createStringColumn("type", p.getStringValue()));
>>> m.insert(p.getCassandraKey(), colFamily,
>>>     HFactory.createColumn("data", p.getCompressedXML(),
>>>         StringSerializer.get(), BytesArraySerializer.get()));
>>>
>>> Will the insertions be two separate calls to Cassandra, or are they just
>>> one transaction? If the former, is there any way to make them one call
>>> to Cassandra?
>>>
>>> 2) How do I store a list/array of data in Cassandra? For example, I have
>>> a data field called categories, which includes zero or more categories,
>>> and each category includes a category id and a category description.
>>> How do people usually handle this scenario when they use Cassandra?
>>>
>>> Thanks in advance,
>>>
>>> John
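
As a rough illustration of the two options Victor describes for the categories list (one column per list element, or the whole list serialized into a single column), here is a minimal plain-Java sketch. It does not call Cassandra or Hector at all: the sorted map only stands in for the columns of one row, and the names CategoryStorageSketch, Category, toColumns, toSingleColumnValue and fromSingleColumnValue are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CategoryStorageSketch {

    // Hypothetical value type: one category with an id and a description.
    static final class Category implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final String description;

        Category(String id, String description) {
            this.id = id;
            this.description = description;
        }
    }

    // Option 1: one column per element. The category id plays the role of
    // the column name and the description the role of the column value.
    // The TreeMap is only a stand-in for the sorted columns of one row.
    static Map<String, String> toColumns(List<Category> categories) {
        Map<String, String> columns = new TreeMap<>();
        for (Category c : categories) {
            columns.put(c.id, c.description);
        }
        return columns;
    }

    // Option 2: serialize the whole list into one byte[] that could be
    // written as the value of a single column.
    static byte[] toSingleColumnValue(List<Category> categories) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            // Copy into an ArrayList so the serialized type is predictable.
            out.writeObject(new ArrayList<>(categories));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverse of option 2: rebuild the list from the single column value.
    @SuppressWarnings("unchecked")
    static List<Category> fromSingleColumnValue(byte[] value) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(value))) {
            return (List<Category>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<Category> categories = Arrays.asList(
            new Category("c2", "Books"),
            new Category("c1", "Music"));

        // Option 1: each entry would become one column in the row.
        System.out.println("columns: " + toColumns(categories));

        // Option 2: one opaque value, read back as a whole list.
        byte[] blob = toSingleColumnValue(categories);
        System.out.println("restored: " + fromSingleColumnValue(blob).size());
    }
}
```

With one column per category, a single category can be read or updated on its own. With the serialized list, the value is always read and written as a whole, and its format is tied to Java serialization.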