Re: SSTable structure

Serega Sheypak Thu, 02 Apr 2015 02:13:45 -0700

Thank you, great to know that.

2015-04-01 23:14 GMT+02:00 Bharatendra Boddu <[email protected]>:


> Hi Serega,
>
> Most of the content in the blog article is still relevant. After 1.2.5
> (ic), there are only three new versions (ja, jb, ka) of the SSTable format.
> The changes in these versions are as follows.
>
>         // ja (2.0.0): super columns are serialized as composites (note that there is no real format change,
>         //               this is mostly a marker to know if we should expect super columns or not. We do need
>         //               a major version bump however, because we should not allow streaming of super columns
>         //               into this new format)
>         //             tracks max local deletiontime in sstable metadata
>         //             records bloom_filter_fp_chance in metadata component
>         //             remove data size and column count from data file (CASSANDRA-4180)
>         //             tracks max/min column values (according to comparator)
>         // jb (2.0.1): switch from crc32 to adler32 for compression checksums
>         //             checksum the compressed data
>         // ka (2.1.0): new Statistics.db file format
>         //             index summaries can be downsampled and the sampling level is persisted
>         //             switch uncompressed checksums to adler32
>         //             tracks presence of legacy (local and remote) counter shards
>
> - bharat
>
> On Wed, Apr 1, 2015 at 12:02 AM, Serega Sheypak <[email protected]>
> wrote:
>
>> Hi bharat,
>> you are talking about Cassandra 1.2.5. Does it apply to Cassandra 2.1?
>> Were there any significant changes to the SSTable format and layout?
>> Thank you, the article is interesting.
>>
>> Hi jacob <[email protected]>,
>> HBase does it for example.
>> http://hbase.apache.org/book.html#_hfile_format_2
>> It would be great to give general ideas. It could help to understand
>> schema design problems. You start to understand better how Cassandra scans
>> data and how you can utilize its power.
>>
>> 2015-04-01 5:39 GMT+02:00 Bharatendra Boddu <[email protected]>:
>>
>>> Some time back I created a blog article about the SSTable storage format
>>> with some code references.
>>>
>>> Cassandra: SSTable Storage Format
>>> <http://distributeddatastore.blogspot.com/2013/08/cassandra-sstable-storage-format.html>
>>>
>>> - bharat
>>>
>>> On Mon, Mar 30, 2015 at 5:24 PM, Jacob Rhoden <[email protected]>
>>> wrote:
>>>
>>>> Yes, updating code and documentation can sometimes be annoying; you
>>>> would only ever maintain both if it were important. It comes down to this:
>>>> is having the format of the data files documented for everyone to
>>>> understand an important thing?
>>>>
>>>> ______________________________
>>>> Sent from iPhone
>>>>
>>>> On 31 Mar 2015, at 11:07 am, daemeon reiydelle <[email protected]>
>>>> wrote:
>>>>
>>>> Why? Then there are 2 places to maintain or get jira'ed for a
>>>> discrepancy.
>>>> On Mar 30, 2015 4:46 PM, "Robert Coli" <[email protected]> wrote:
>>>>
>>>>> On Mon, Mar 30, 2015 at 1:38 AM, Pierre <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Does anyone know if there is a more complete and up to date
>>>>>> documentation about the sstable files structure (data, index, stats etc.)
>>>>>> than this one : http://wiki.apache.org/cassandra/ArchitectureSSTable
>>>>>
>>>>>
>>>>> No, there isn't. Unfortunately you will have to read the source.
>>>>>
>>>>>
>>>>>> I'm looking for a full specification, with schema of the structure if
>>>>>> possible.
>>>>>>
>>>>>
>>>>> It would be nice if such fundamental things were documented, wouldn't
>>>>> it?
>>>>>
>>>>> =Rob
>>>>>
>>>>>
>>>>
>>>
>>
>
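
For archive readers: the version notes quoted earlier in this thread (ja, jb, ka) can be written down as a small lookup table. The sketch below is only an illustration, not anything taken from the Cassandra code base. The names SSTABLE_VERSIONS, changes_since and version_from_filename are made up for this example. The filename layout <keyspace>-<table>-<version>-<generation>-<component> is an assumption about the 2.0/2.1 on-disk naming, so check it against your own data directory. Ordering versions by plain string comparison is also an assumption that happens to hold for the versions listed here.

```python
# Version notes transcribed from the comment quoted in this thread.
# Keys are SSTable format versions; values are (Cassandra release, changes).
SSTABLE_VERSIONS = {
    "ja": ("2.0.0", [
        "super columns are serialized as composites",
        "tracks max local deletiontime in sstable metadata",
        "records bloom_filter_fp_chance in metadata component",
        "remove data size and column count from data file (CASSANDRA-4180)",
        "tracks max/min column values (according to comparator)",
    ]),
    "jb": ("2.0.1", [
        "switch from crc32 to adler32 for compression checksums",
        "checksum the compressed data",
    ]),
    "ka": ("2.1.0", [
        "new Statistics.db file format",
        "index summaries can be downsampled and the sampling level is persisted",
        "switch uncompressed checksums to adler32",
        "tracks presence of legacy (local and remote) counter shards",
    ]),
}


def changes_since(version):
    """Return (version, release, change) tuples for every listed version
    that sorts after `version`.

    Assumption: plain string comparison orders the versions correctly
    ("ic" < "ja" < "jb" < "ka").
    """
    result = []
    for tag in sorted(SSTABLE_VERSIONS):
        if tag > version:
            release, changes = SSTABLE_VERSIONS[tag]
            result.extend((tag, release, change) for change in changes)
    return result


def version_from_filename(filename):
    """Extract the format version from an SSTable component filename.

    Assumption: the name looks like
    <keyspace>-<table>-<version>-<generation>-<component>,
    e.g. "Keyspace1-Standard1-ka-1-Data.db" (a hypothetical example).
    """
    parts = filename.rsplit("-", 3)
    if len(parts) != 4:
        raise ValueError("unexpected SSTable filename: %r" % filename)
    return parts[1]


if __name__ == "__main__":
    name = "Keyspace1-Standard1-ka-1-Data.db"  # hypothetical file name
    print(version_from_filename(name))
    for tag, release, change in changes_since("ic"):
        print(tag, release, change)
```

Calling changes_since("ic") lists everything that changed after the 1.2.5 format the blog article describes, which is roughly the question asked in this thread. Passing "jb" instead narrows it to the 2.1.0 changes.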
