Please realize that I do not make any decisions here and I am not part of the core Cassandra developer team.
What has been said before is that they will most likely go away and, at least under the hood, be replaced by composite columns. Jonathan has, however, stated that he would like the supercolumn API/abstraction to remain, at least for backwards compatibility.

Please understand that under the hood, supercolumns are merely groups of columns serialized as a single block of data. The fact that there is a specialized, hardcoded way to serialize these column groups into supercolumns is a problem, however, and they should probably go away to make space for a more generic implementation that allows more flexible data structures and needs less code specific to one special data structure.

Today there is a lot of extra code to deal with the slight differences in serialization and features between supercolumns and columns, and hopefully most of that could go away if things were structured a bit differently.

I also hope that we keep APIs that allow simple access to groups of key/value pairs. Working with just columns can add a lot of application code that should not be needed.

If you almost always need all, or nearly all, of the columns in a supercolumn, and you normally update all of them at the same time, they will most likely be faster than normal columns. Processing-wise, you will actually do a bit more work on serialization/deserialization of SCs, but the I/O will usually be better grouped and require fewer operations.

I think we did some benchmarks on some heavy use cases with ~30 small columns per SC some time back, and I think we ended up with SCs being 10-20% faster.

Terje

On Jan 5, 2012, at 2:37 PM, Aklin_81 wrote:

> I have seen supercolumns usage been discouraged most of the times.
> However sometimes the supercolumns seem to fit the scenario most
> appropriately not only in terms of how the data is stored but also in
> terms of how is it retrieved. Some of the queries supported by SCs are
> uniquely capable of doing the task which no other alternative schema
> could do.(Like recently I asked about getting the equivalent of
> retrieving a list of (full)supercolumns by name, through use of
> composite columns, unfortunately there was no way to do this without
> reading lots of extra columns).
>
> So I am really confused whether:
>
> 1. Should I really not use the supercolumns for any case at all,
> however appropriate, or I just need to be just careful while realizing
> that supercolumns fit my use case appropriately or what!?
>
> 2. Are there any performance concerns with supercolumns even in the
> cases where they are used most appropriately. Like when you need to
> retrieve the entire supercolumns everytime & max. no of subcolumns
> vary between 0-10.
> (I don't write all the subcolumns inside supercolumn, at once though!
> Does this also matter?)
>
> 3. What is their future? Are they going to be deprecated or may be
> enhanced later?
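
For anyone who wants a concrete picture of the two layouts discussed in this thread, here is a minimal Python sketch. It is only a toy in-memory model, not Cassandra's storage engine or any client API: the names `sc_row`, `composite_row`, `get_supercolumns` and `get_groups_composite` are all hypothetical, and the sample data is made up. It models a supercolumn row as named groups of columns, and the composite-column equivalent as one flat, sorted list of columns whose names are (group, subcolumn) pairs.

```python
from bisect import bisect_left

# Supercolumn-style row (toy model): supercolumn name -> {subcolumn: value}.
sc_row = {
    "post1": {"author": "a", "title": "t1"},
    "post2": {"author": "b", "title": "t2"},
    "post3": {"author": "c", "title": "t3"},
}

# Composite-style row (toy model): one flat list of columns, sorted by the
# composite name (group, subcolumn).
composite_row = sorted(
    ((group, sub), value)
    for group, subs in sc_row.items()
    for sub, value in subs.items()
)


def get_supercolumns(row, names):
    """Fetch whole groups by name: one lookup per requested group."""
    return {name: row[name] for name in names if name in row}


def get_groups_composite(row, names):
    """Fetch the same groups from the flat layout.

    Each requested group needs its own prefix scan over the sorted columns.
    """
    keys = [key for key, _ in row]
    result = {}
    for name in names:
        # (name,) sorts before every (name, sub), so this finds the first
        # column of the group, if the group exists.
        i = bisect_left(keys, (name,))
        while i < len(row) and row[i][0][0] == name:
            result.setdefault(name, {})[row[i][0][1]] = row[i][1]
            i += 1
    return result


wanted = ["post1", "post3"]
print(get_supercolumns(sc_row, wanted))
print(get_groups_composite(composite_row, wanted))
```

In this model both functions return the same groups. The difference is only in how the data is addressed: the first looks up a named group directly, while the second performs one prefix scan per requested name over the flat column list.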