Re: [DISCUSS] CEP-19: Trie memtable implementation

Branimir Lambov Thu, 10 Feb 2022 03:05:26 -0800

Let us continue the configuration discussion in the CEP-11 JIRA (
https://issues.apache.org/jira/browse/CASSANDRA-17034).


Any further comments on the alternate memtable? Are we ready for a vote?

Regards,
Branimir


On Wed, Feb 9, 2022 at 12:13 PM Bowen Song <bo...@bso.ng> wrote:

> TBH, I don't have an opinion on the configuration. I just want to say that
> if at the end we decide the configuration in the YAML should override the
> table schema, I would like to recommend that we specify a list of
> whitelisted (or blacklisted) "templates" in the YAML file, and the template
> chosen by the table schema is used if it's enabled, otherwise we fall back to a
> default template, which could be the first element in the whitelist if
> that's used, or a separate configuration entry if a blacklist is used. The
> list should be optional in the YAML, and an empty list or the absence of it
> means everything is enabled.
>
> Advantages of this:
>
> 1. it doesn't require the operator to configure this, as an empty or
> absent list by default enables all templates and should work fine in most
> cases.
>
> 2. it allows the operator to whitelist / blacklist any template if ever
> needed (e.g. due to a bug), and also allows them to choose a fallback option.
>
> 3. the table schema has priority as long as the chosen template is not
> explicitly disabled by the YAML.
>
> 4. it allows the operator to selectively disable some templates without
> forcing all tables to use the same template specified by the YAML.
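
The resolution rule proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class TemplateFilter, its method, and all parameter names are hypothetical and are not part of Cassandra or of the CEP-11 patch. It also assumes that at most one of the two lists is configured on a node.

```java
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; nothing here exists in Cassandra. */
final class TemplateFilter {
    private TemplateFilter() {}

    /**
     * Picks the template a table ends up with on this node.
     *
     * @param requested template named by the table schema
     * @param allowed   allow list from the YAML; an empty list means "no allow list"
     * @param denied    deny list from the YAML; an empty set means "no deny list"
     * @param fallback  separate fallback entry, used when no allow list is configured
     */
    static String resolve(String requested, List<String> allowed,
                          Set<String> denied, String fallback) {
        // An empty or absent list enables everything.
        boolean permitted = (allowed.isEmpty() || allowed.contains(requested))
                            && !denied.contains(requested);
        if (permitted) {
            return requested; // the schema choice has priority
        }
        // With an allow list, its first element is the default;
        // with a deny list, the separate fallback entry is used.
        return allowed.isEmpty() ? fallback : allowed.get(0);
    }

    public static void main(String[] args) {
        // No lists configured: the schema choice is used as-is.
        System.out.println(resolve("trie", List.of(), Set.of(), "skiplist"));
        // "trie" is denied on this node: the fallback entry is used.
        System.out.println(resolve("trie", List.of(), Set.of("trie"), "skiplist"));
    }
}
```

With this shape, an operator who hits a bug in one implementation would only add its template name to the deny list on the affected nodes; tables using other templates are left alone.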
>
>
> On 09/02/2022 09:43, bened...@apache.org wrote:
>
> Why not have some default templates that can be specified by the schema
> without touching the yaml, but overridden in the yaml as necessary?
>
>
>
> *From: *Branimir Lambov <blam...@apache.org>
> *Date: *Wednesday, 9 February 2022 at 09:35
> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
> *Subject: *Re: [DISCUSS] CEP-19: Trie memtable implementation
>
> If I understand this correctly, you prefer _not_ to have an option to give
> the configuration explicitly in the schema. I.e. force the configurations
> ("templates" in current terms) to be specified in the yaml, and only allow
> tables to specify which one to use among them?
>
>
>
> This does sound at least as good to me, and I'll happily change the API.
>
>
>
> Regards,
>
> Branimir
>
>
>
> On Tue, Feb 8, 2022 at 10:40 PM Dinesh Joshi <djo...@apache.org> wrote:
>
> My quick reading of the code suggests that schema will override the
> operator's default preference in the YAML. In the event of a bug in the new
> implementation, there could be a situation where the operator might need to
> override this via the YAML.
>
>
>
> On Feb 8, 2022, at 12:29 PM, Jeremiah D Jordan <jeremiah.jor...@gmail.com>
> wrote:
>
>
>
> I don’t really see most users touching the default implementation.  I
> would expect the main reason someone would change would be
>
> 1. They run into some bug that is only in one of the implementations.
>
> 2. They have persistent memory and so want to use
> https://issues.apache.org/jira/browse/CASSANDRA-13981
>
>
>
> Given that I doubt most people will touch it, I think it is good to give
> advanced operators the ability to have more control over switching to
> things that have new performance characteristics.  So I like that
> the proposed configuration approach allows someone to change to a new
> implementation one node at a time and only for specific tables.
>
>
>
> On Feb 8, 2022, at 2:21 PM, Dinesh Joshi <djo...@apache.org> wrote:
>
>
>
> Thank you for sharing the perf test results.
>
>
>
> Going back to the schema vs yaml configuration. I am concerned users may
> pick the wrong implementation for their use-case. Is there any chance for
> us to automatically pick a MemTable implementation based on heuristics? Do
> we foresee users ever picking the existing SkipList implementation over the
> Trie? Given the performance tests, it seems the Trie implementation is the
> clear winner.
>
>
>
> To be clear, I am not suggesting we remove the existing implementation. I
> am for maintaining a pluggable API for various components.
>
>
>
> Dinesh
>
>
>
> On Feb 7, 2022, at 8:39 AM, Branimir Lambov <blam...@apache.org> wrote:
>
>
>
> Added some performance results to the ticket:
> https://issues.apache.org/jira/browse/CASSANDRA-17240
>
>
>
> Regards,
>
> Branimir
>
>
>
> On Sat, Feb 5, 2022 at 10:59 PM Dinesh Joshi <djo...@apache.org> wrote:
>
> This is excellent. Thanks for opening up this CEP. It would be great to
> get some stats around GC allocation rate / memory pressure, read & write
> latencies, etc. compared to existing implementation.
>
>
>
> Dinesh
>
>
>
> On Jan 18, 2022, at 2:13 AM, Branimir Lambov <blam...@apache.org> wrote:
>
>
>
> The memtable pluggability API (CEP-11) is per-table to enable memtable
> selection that suits specific workflows. It also makes sense to permit
> per-node configuration, both to adapt the configuration to
> heterogeneous deployments and to test changes for
> improvements such as this one.
>
> Recognizing this, the patch comes with a modification to the API
> <https://github.com/blambov/cassandra/commit/24b558ba2f71a2f040804e28993cc914b31298f5>
> that defines memtable templates in cassandra.yaml (i.e. per node) and
> allows the schema to select a template (in addition to being able to
> specify the full memtable configuration). One could use this e.g. by adding:
>
> memtable_templates:
>     trie:
>         class: TrieMemtable
>         shards: 16
>     skiplist:
>         class: SkipListMemtable
> memtable:
>     template: skiplist
>
> (which defines two templates and specifies the default memtable
> implementation to use) to cassandra.yaml and specifying  *WITH memtable =
> {'template' : 'trie'} *in the table schema.
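
To make the lookup order concrete, here is a minimal sketch of how a node might turn the YAML templates and a table's schema option into an effective memtable configuration. It is an assumption-laden illustration, not the code in the linked commit: the class MemtableTemplates, the method resolve, and the use of plain string maps are all hypothetical.

```java
import java.util.Map;

/** Hypothetical illustration of template lookup; not the API from the patch. */
final class MemtableTemplates {
    private MemtableTemplates() {}

    /**
     * @param templates       per-node templates (the memtable_templates section)
     * @param defaultTemplate per-node default (the memtable.template entry)
     * @param tableOptions    the table's memtable option map from the schema;
     *                        empty when the table specifies nothing
     * @return the effective memtable configuration for this table on this node
     */
    static Map<String, String> resolve(Map<String, Map<String, String>> templates,
                                       String defaultTemplate,
                                       Map<String, String> tableOptions) {
        if (tableOptions.isEmpty()) {
            return lookup(templates, defaultTemplate); // node default applies
        }
        String name = tableOptions.get("template");
        if (name == null) {
            return tableOptions; // full configuration given directly in the schema
        }
        return lookup(templates, name); // schema selects a per-node template
    }

    private static Map<String, String> lookup(Map<String, Map<String, String>> templates,
                                              String name) {
        Map<String, String> config = templates.get(name);
        if (config == null) {
            throw new IllegalArgumentException("Unknown memtable template: " + name);
        }
        return config;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> templates = Map.of(
            "trie", Map.of("class", "TrieMemtable", "shards", "16"),
            "skiplist", Map.of("class", "SkipListMemtable"));
        // Table created WITH memtable = {'template' : 'trie'}
        System.out.println(resolve(templates, "skiplist", Map.of("template", "trie")));
        // Table with no memtable option falls back to the node default
        System.out.println(resolve(templates, "skiplist", Map.of()));
    }
}
```

Because the templates live in cassandra.yaml, the same template name can map to different settings on different nodes, which is what makes one-node-at-a-time testing possible.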
>
>
>
> I intend to commit this modification with the memtable API
> (CASSANDRA-17034/CEP-11).
>
>
>
> Performance comparisons will be published soon.
>
>
>
> Regards,
>
> Branimir
>
>
>
> On Fri, Jan 14, 2022 at 4:15 PM Jeff Jirsa <jji...@gmail.com> wrote:
>
> Sounds like a great addition
>
>
>
> Can you share some of the details around gc and latency improvements
> you’ve observed with the list?
>
>
>
> Any specific reason the configuration is through schema vs yaml? Presumably
> it’s so a user can test per table, but this changes every host in a
> cluster, so the impact of a bug/regression is much higher.
>
>
>
>
>
> On Jan 10, 2022, at 1:30 AM, Branimir Lambov <blam...@apache.org> wrote:
>
>
>
> We would like to contribute our TrieMemtable to Cassandra.
>
>
>
>
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-19%3A+Trie+memtable+implementation
>
>
>
> This is a new memtable solution intended to replace the legacy
> implementation, developed with the following objectives:
>
> - lowering the on-heap complexity and enabling memtable indexing
> structures to be stored off-heap,
>
> - leveraging byte order and a trie structure to lower the memory footprint
> and improve mutation and lookup performance.
>
>
>
> The new memtable relies on CASSANDRA-6936 to translate to and from
> byte-ordered representations of types, and CASSANDRA-17034 / CEP-11 to plug
> into Cassandra. The memtable is built on multiple shards of custom
> in-memory single-writer multiple-reader tries, whose implementation uses a
> combination of state-of-the-art and novel features for greater efficiency.
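
As a rough illustration of what a "byte-ordered representation" means, the sketch below encodes a signed 32-bit integer so that plain unsigned byte-by-byte comparison of the encodings agrees with numeric order. This is a generic, well-known technique (big-endian bytes with the sign bit flipped) and is only assumed to be similar in spirit to CASSANDRA-6936; the class ByteOrderedInt and its methods are hypothetical and are not the actual encoding used there.

```java
import java.util.Arrays;

/** Hypothetical example of an order-preserving byte encoding for int keys. */
final class ByteOrderedInt {
    private ByteOrderedInt() {}

    /** Big-endian bytes with the sign bit flipped, so negatives sort first. */
    static byte[] encode(int value) {
        int flipped = value ^ Integer.MIN_VALUE;
        return new byte[] {
            (byte) (flipped >>> 24),
            (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),
            (byte) flipped
        };
    }

    /** Inverse of encode: reassemble the bytes, then flip the sign bit back. */
    static int decode(byte[] bytes) {
        int flipped = ((bytes[0] & 0xFF) << 24)
                    | ((bytes[1] & 0xFF) << 16)
                    | ((bytes[2] & 0xFF) << 8)
                    | (bytes[3] & 0xFF);
        return flipped ^ Integer.MIN_VALUE;
    }

    /** Unsigned lexicographic comparison, the order a byte trie walks keys in. */
    static int compare(byte[] left, byte[] right) {
        return Arrays.compareUnsigned(left, right); // available since Java 9
    }

    public static void main(String[] args) {
        // Negative result: -5 sorts before 3 in byte order, as it does numerically.
        System.out.println(compare(encode(-5), encode(3)) < 0);
        System.out.println(decode(encode(-42)));
    }
}
```

Once every key type has such an encoding, a trie keyed on raw bytes can keep entries in the correct sort order without calling type-specific comparators during lookups or insertions.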
>
>
>
> The CEP's JIRA ticket (
> https://issues.apache.org/jira/browse/CASSANDRA-17240) contains the
> initial version of the implementation. In its current form it achieves much
> better garbage collection latency, significantly bigger data sizes between
> flushes for the same memory allocation, as well as drastically increased
> write throughput, and we expect the memory and garbage collection
> improvements to go much further with upcoming improvements to the solution.
>
>
>
> I am interested in hearing your thoughts on the proposal.
>
>
>
> Regards,
>
> Branimir
>
>
>
>
>
>
>
>
>
>
>
>
