Hi Serega,

Most of the content in the blog article is still relevant. After 1.2.5 (ic), there are only three new versions of the SSTable format (ja, jb, ka). The changes in these versions are as follows.

// ja (2.0.0): super columns are serialized as composites (note that there is no real format change,
//               this is mostly a marker to know if we should expect super columns or not. We do need
//               a major version bump however, because we should not allow streaming of super columns
//               into this new format)
//             tracks max local deletiontime in sstable metadata
//             records bloom_filter_fp_chance in metadata component
//             remove data size and column count from data file (CASSANDRA-4180)
//             tracks max/min column values (according to comparator)
// jb (2.0.1): switch from crc32 to adler32 for compression checksums
//             checksum the compressed data
// ka (2.1.0): new Statistics.db file format
//             index summaries can be downsampled and the sampling level is persisted
//             switch uncompressed checksums to adler32
//             tracks presence of legacy (local and remote) counter shards

- bharat

On Wed, Apr 1, 2015 at 12:02 AM, Serega Sheypak <serega.shey...@gmail.com> wrote:

> Hi bharat,
> you are talking about Cassandra 1.2.5. Does it fit Cassandra 2.1?
> Were there any significant changes to SSTable format and layout?
> Thank you, article is interesting.
>
> Hi jacob <jacob.rho...@me.com>,
> HBase does it for example.
> http://hbase.apache.org/book.html#_hfile_format_2
> It would be great to give general ideas. It could help to understand
> schema design problems. You start to understand better how Cassandra scans
> data and how you can utilize its power.
>
> 2015-04-01 5:39 GMT+02:00 Bharatendra Boddu <bharatend...@gmail.com>:
>
>> Some time back I created a blog article about the SSTable storage format
>> with some code references.
>>
>> Cassandra: SSTable Storage Format
>> <http://distributeddatastore.blogspot.com/2013/08/cassandra-sstable-storage-format.html>
>>
>> - bharat
>>
>> On Mon, Mar 30, 2015 at 5:24 PM, Jacob Rhoden <jacob.rho...@me.com>
>> wrote:
>>
>>> Yes, updating code and documentation can sometimes be annoying; you would
>>> only ever maintain both if it were important. It comes down to: is having
>>> the format of the data files documented for everyone to understand an
>>> important thing?
>>>
>>> ______________________________
>>> Sent from iPhone
>>>
>>> On 31 Mar 2015, at 11:07 am, daemeon reiydelle <daeme...@gmail.com>
>>> wrote:
>>>
>>> why? Then there are 2 places 2 maintain or get jira'ed for a discrepancy.
>>> On Mar 30, 2015 4:46 PM, "Robert Coli" <rc...@eventbrite.com> wrote:
>>>
>>>> On Mon, Mar 30, 2015 at 1:38 AM, Pierre <pierredev...@gmail.com> wrote:
>>>>
>>>>> Does anyone know if there is a more complete and up to date
>>>>> documentation about the sstable files structure (data, index, stats etc.)
>>>>> than this one : http://wiki.apache.org/cassandra/ArchitectureSSTable
>>>>
>>>>
>>>> No, there isn't. Unfortunately you will have to read the source.
>>>>
>>>>
>>>>> I'm looking for a full specification, with schema of the structure if
>>>>> possible.
>>>>>
>>>>
>>>> It would be nice if such fundamental things were documented, wouldn't
>>>> it?
>>>>
>>>> =Rob
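
P.S. As a rough illustration of the version notes at the top of this message, here is a small Python sketch that puts the ja/jb/ka changes into a lookup table and lists what changed after a given version. It is not Cassandra code. Two things in it are assumptions on my part: the file name layout `<keyspace>-<table>-<version>-<generation>-<component>` used by the hypothetical `version_from_filename` helper, and the idea that version strings can be compared as plain strings ("ic" < "ja" < "jb" < "ka"). All function and variable names are made up for the example.

```python
# Format versions and their changes, transcribed from the list in this message.
SSTABLE_VERSIONS = {
    "ja": ("2.0.0", [
        "super columns are serialized as composites",
        "tracks max local deletiontime in sstable metadata",
        "records bloom_filter_fp_chance in metadata component",
        "remove data size and column count from data file (CASSANDRA-4180)",
        "tracks max/min column values (according to comparator)",
    ]),
    "jb": ("2.0.1", [
        "switch from crc32 to adler32 for compression checksums",
        "checksum the compressed data",
    ]),
    "ka": ("2.1.0", [
        "new Statistics.db file format",
        "index summaries can be downsampled and the sampling level is persisted",
        "switch uncompressed checksums to adler32",
        "tracks presence of legacy (local and remote) counter shards",
    ]),
}


def version_from_filename(filename):
    """Return the format version embedded in an SSTable file name.

    ASSUMPTION: the name looks like
    <keyspace>-<table>-<version>-<generation>-<component>,
    so the version is the third field counted from the right.
    """
    parts = filename.rsplit("-", 3)
    if len(parts) != 4:
        raise ValueError(f"unexpected SSTable file name: {filename!r}")
    return parts[1]


def changes_since(version):
    """List (version, release, change) for every known version newer than `version`.

    ASSUMPTION: version strings order lexicographically.
    """
    return [
        (v, release, change)
        for v, (release, changes) in sorted(SSTABLE_VERSIONS.items())
        if v > version
        for change in changes
    ]


if __name__ == "__main__":
    # Hypothetical file name for an sstable written in the "ic" format.
    current = version_from_filename("Keyspace1-Standard1-ic-1-Data.db")
    for v, release, change in changes_since(current):
        print(f"{v} ({release}): {change}")
```

Calling `changes_since("jb")` returns only the four ka entries, which is the set of format changes someone moving from 2.0.1 to 2.1.0 would care about.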