Re: [DISCUSS] CEP-25: Trie-indexed SSTable format

Branimir Lambov Mon, 21 Nov 2022 05:54:39 -0800

I see. This does make a lot of sense for full row indexing, and also if one
can specify sub-KB granularity (at the current default we just won't have
a row index in these cases). How does opening a ticket to do these two* after
the current code is committed sound?


* embedded index for sub-X-byte partitions + granularity in bytes
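
To make the two items concrete, here is a rough Python sketch of the idea as discussed in this thread. It is not Cassandra code: `SketchWriter`, `PartitionEntry`, the text encoding of the index, and the one-page limit of 4096 bytes are all hypothetical names and values chosen for illustration. It shows a row index whose granularity is given in bytes, prepended to the partition when everything fits in the write buffer, and otherwise written to a separate file, with the partition index recording which case applies.

```python
from dataclasses import dataclass
from typing import List, Tuple

Row = Tuple[str, bytes]  # (clustering key, serialized row); hypothetical shape


@dataclass
class PartitionEntry:
    """Hypothetical partition-index entry; not the real on-disk layout."""
    key: str
    data_offset: int     # start of the partition in the data file
    index_location: str  # "none", "inline" or "separate"
    index_offset: int    # offset of the row index, or -1 when there is none


def build_row_index(rows: List[Row], granularity: int) -> List[Tuple[str, int]]:
    """Start a new index block at the first row that begins at least
    `granularity` bytes after the previous block start."""
    index, offset, next_mark = [], 0, 0
    for clustering_key, payload in rows:
        if offset >= next_mark:
            index.append((clustering_key, offset))
            next_mark = offset + granularity
        offset += len(payload)
    # A partition that fits in a single block needs no row index at all.
    return index if len(index) > 1 else []


def encode_index(index: List[Tuple[str, int]]) -> bytes:
    """Toy text encoding, assumed only for this sketch."""
    return b"".join(f"{ck}:{off};".encode() for ck, off in index)


class SketchWriter:
    def __init__(self, granularity: int = 1024, page_size: int = 4096):
        self.granularity = granularity      # row index granularity in bytes
        self.page_size = page_size          # assumed write-buffer limit
        self.data_file = bytearray()
        self.row_index_file = bytearray()
        self.partition_index: List[PartitionEntry] = []

    def write_partition(self, key: str, rows: List[Row]) -> PartitionEntry:
        # For brevity the whole partition is passed in; a real writer would
        # stream rows and stop buffering once one page is exceeded.
        body = b"".join(payload for _, payload in rows)
        encoded = encode_index(build_row_index(rows, self.granularity))
        start = len(self.data_file)
        if not encoded:
            entry = PartitionEntry(key, start, "none", -1)
        elif len(encoded) + len(body) <= self.page_size:
            # Small partition: prepend the row index in the data file.
            self.data_file += encoded
            entry = PartitionEntry(key, start, "inline", start)
        else:
            # Overflowed the buffer: keep the row index in a separate file.
            entry = PartitionEntry(key, start, "separate", len(self.row_index_file))
            self.row_index_file += encoded
        self.data_file += body
        self.partition_index.append(entry)
        return entry
```

With, say, `SketchWriter(granularity=16, page_size=128)`, a partition of two 20-byte rows gets an inline index, while ten 50-byte rows overflow the buffer and their index goes to the separate file.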

On Mon, Nov 21, 2022 at 3:38 PM Benedict <bened...@apache.org> wrote:

> Buffering on write up to at most one page seems fine? Once you are past a
> single page it’s fine to write the index either to the end of the partition
> or to a separate file, as there’s not much to be gained either way. But
> especially for small partitions there’s likely significant value in
> prepending it?
>
> It might be preferable to retain the separate index for those that
> overflow this buffer, and simply encode in the partition index whether the
> row index is inline or in the separate file.
>
> On 21 Nov 2022, at 13:29, Branimir Lambov <blam...@apache.org> wrote:
>
>
> There is no intention to introduce any new versions of the format
> specifically for DSE. If there are any further changes to the format, they
> will be OSS-first. In other words this support only extends to preexisting
> versions of the format.
>
> Inline row index in the data file is not something we have implemented,
> and it's currently not in any plans. I personally am not sure how it can be
> done to provide a benefit: if we place it at the end of a partition, it
> does not help much compared to a separate file; if we place it in front, we
> have to buffer the partition content, which will affect write performance.
> In either case it may be harder to cache. Do you have something different
> in mind?
>
> Regards,
> Branimir
>
> On Mon, Nov 21, 2022 at 3:01 PM Benedict <bened...@apache.org> wrote:
>
>> Personally very pleased to see this proposal, and I’m not opposed to
>> easing your migration by maintaining some light support for internal file
>> versions - though would prefer the support have some version limit where it
>> can be excised (maybe for one minor version bump?)
>>
>> One implementation question: are there any plans to support inline row
>> index in the big sstable format files? Is this something DSE supports, and
>> on the roadmap just not for initial work, or currently not envisioned?
>>
>> I would anticipate significant advantage to this for many workloads, and
>> no downside (except for streaming - which could be resolved fairly easily
>> by skipping over these sections when streaming to an old node, but since we
>> don’t generally stream between versions I don’t see any major issue anyway).
>>
>>
>> On 21 Nov 2022, at 12:43, Branimir Lambov <blam...@apache.org> wrote:
>>
>>
>> Hi everyone,
>>
>> We would like to put CEP-25 for discussion.
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format
>>
>> The proposal describes DSE's Big Trie-indexed SSTable format, which
>> replaces the primary index with on-disk tries to improve lookup performance
>> and index size, better handle wide partitions, and remove the need to
>> manage key caching and index summaries.
>>
>> We would like to discuss this proposal with you.
>>
>> One of the questions that we want to ask is whether anyone objects to
>> maintaining full compatibility with existing files created by DataStax
>> Enterprise.
>>
>> Regards,
>> Branimir
>>
>>
>
>
>
