I see. This does make a lot of sense for full row indexing, and also if one can specify sub-kb granularity (at the current default we just won't have an index in these cases). How does opening a ticket to do these two* after the current code is committed sound?
* embedded index for sub-X-byte partitions + granularity in bytes

On Mon, Nov 21, 2022 at 3:38 PM Benedict <bened...@apache.org> wrote:

> Buffering on write up to at most one page seems fine? Once you are past a
> single page it’s fine to write either to the end of the partition or to a
> separate file, there’s nothing much to be gained, but esp. for small
> partitions there’s likely significant value in prepending it?
>
> It might be preferable to retain the separate index for those that
> overflow this buffer, and simply encode in the partition index whether the
> row index is inline or in the separate file.
>
> On 21 Nov 2022, at 13:29, Branimir Lambov <blam...@apache.org> wrote:
>
> There is no intention to introduce any new versions of the format
> specifically for DSE. If there are any further changes to the format, they
> will be OSS-first. In other words this support only extends to preexisting
> versions of the format.
>
> Inline row index in the data file is not something we have implemented,
> and it's currently not in any plans. I personally am not sure how it can be
> done to provide a benefit: if we place it at the end of a partition, it
> does not help much compared to a separate file; if we place it in front, we
> have to buffer the partition content, which will affect write performance.
> In either case it may be harder to cache. Do you have something different
> in mind?
>
> Regards,
> Branimir
>
> On Mon, Nov 21, 2022 at 3:01 PM Benedict <bened...@apache.org> wrote:
>
>> Personally very pleased to see this proposal, and I’m not opposed to
>> easing your migration by maintaining some light support for internal file
>> versions - though would prefer the support have some version limit where it
>> can be excised (maybe for one minor version bump?)
>>
>> One implementation question: are there any plans to support inline row
>> index in the big sstable format files? Is this something DSE supports, and
>> on the roadmap just not for initial work, or currently not envisioned?
>>
>> I would anticipate significant advantage to this for many workloads, and
>> no downside (except for streaming - which could be resolved fairly easily
>> by skipping over these sections when streaming to an old node, but since we
>> don’t generally stream between versions I don’t see any major issue anyway).
>>
>> On 21 Nov 2022, at 12:43, Branimir Lambov <blam...@apache.org> wrote:
>>
>> Hi everyone,
>>
>> We would like to put CEP-25 for discussion.
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-25%3A+Trie-indexed+SSTable+format
>>
>> The proposal describes DSE's Big Trie-indexed SSTable format, which
>> replaces the primary index with on-disk tries to improve lookup performance
>> and index size, better handle wide partitions, and remove the need to
>> manage key caching and index summaries.
>>
>> We would like to discuss this proposal with you.
>>
>> One of the questions that we want to ask is whether anyone objects to
>> maintaining full compatibility with existing files created by DataStax
>> Enterprise.
>>
>> Regards,
>> Branimir