I think my earlier response vanished into the moderator queue. Just a few comments:
1) I think the Paxos latency (and correctness) improvements should land in 4.0.x, as we have introduced a fairly significant regression and this work mostly resolves outstanding issues with LWTs today.

2) If we aim to deliver multi-partition LWTs in 4.x/5.0, we will likely want to pair this with work to reduce latency further beyond the above work, as contention will become a more significant problem. Should I be involved in delivering multi-partition LWTs, I will also be aiming to deliver even lower latencies for the release they land in.

3) To support all of the above work, I also aim to deliver a Simulator facility for deterministically executing cluster workloads under adversarial scheduling (i.e. one that intercepts all message and thread events and evaluates them sequentially, in pseudorandom order), alongside linearizability verification built upon this. This work will include (or have as a prerequisite) significant clean-ups to internal functionality like executors, use of futures and other concurrency primitives, and mocking out of time and the filesystem.

On 23/04/2021, 14:46, "Benjamin Lerer" <b.le...@gmail.com> wrote:

Hi everybody,

Thanks for all the responses. I went through the emails and aggregated the proposals to give us an idea of where we stand at this point. I only included the improvements in the list and left the bug fixes aside.

Regarding bug fixes, I wonder if we should hold a monthly discussion on which important issues should be fixed first. I feel that we sometimes tend to forget old issues even if they are more important than some new ones.

Do not hesitate to tell me if I missed something or misinterpreted some proposal.

*Query side improvements:*

* Storage Attached Index or SAI.
The CEP can be found at https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-7%3A+Storage+Attached+Index
* Add support for OR predicates in the CQL WHERE clause
* Allow aggregation by time intervals (CASSANDRA-11871) and allow UDFs in the GROUP BY clause
* Ability to read the TTL and WRITE TIME of an element in a collection (CASSANDRA-8877)
* Multi-Partition LWTs
* Materialized views hardening: Addressing the different Materialized Views issues (see CASSANDRA-15921 and [1] for some of the work involved)

*Security improvements:*

* SSTables encryption (CASSANDRA-9633)
* Add support for Dynamic Data Masking (CEP pending)
* Allow the creation of roles that have the ability to assign arbitrary privileges, or scoped privileges without also granting those roles access to database objects.
* Filter rows from system and system_schema based on user permissions (CASSANDRA-15871)

*Performance improvements:*

* Trie-based index format (CEP pending)
* Trie-based memtables (CEP pending)
* Paxos improvements: Paxos / LWT implementation that would enable the database to serve serial writes with two round-trips and serial reads with one round-trip in the uncontended case

*Safety/Usability improvements:*

* Guardrails.
The CEP can be found at https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
* Add ability to track state in repair (CASSANDRA-15399)
* Repair coordinator improvements (CASSANDRA-15399)
* Make incremental backup configurable per keyspace and table (CASSANDRA-15402)
* Add ability to blacklist a CQL partition so all requests are ignored (CASSANDRA-12106)
* Add default and required keyspace replication options (CASSANDRA-14557)
* Transactional Cluster Metadata: Use of transactions to propagate cluster metadata
* Downgrade-ability: Ability to downgrade in the event that a serious issue has been identified

*Pluggability improvements:*

* Pluggable schema manager (CEP pending)
* Pluggable filesystem (CEP pending)
* Pluggable authenticator for CQLSH (CASSANDRA-16456). A CEP draft can be found at https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit
* Memtable API (CEP pending). The goal is to allow improvements such as CASSANDRA-13981 to be easily plugged into Cassandra

*Memtable pluggable implementation:*

* Enable Cassandra for Persistent Memory (CASSANDRA-13981)

*Other tools:*

* CQL compatibility test suite (CEP pending)

On Thu, Apr 22, 2021 at 16:11, Benjamin Lerer <b.le...@gmail.com> wrote:

>> Finally, I think it's important we work to maintain trunk in a shippable
>> state.
>
> I am +100 on this. Bringing Cassandra to such a state was a huge effort
> and keeping it that way will help us to ensure the quality of the
> releases.
>
> On Thu, Apr 15, 2021 at 17:30, Scott Andreas <sc...@paradoxica.net> wrote:
>
>> Thanks for starting this discussion, Benjamin!
>>
>> I share others’ enthusiasm on this thread for improvements to secondary
>> indexes, trie-based partition indexes, guardrails, and encryption at rest.
>>
>> Here are some other post-4.0 areas for investment that have been on my
>> mind:
>>
>> – Transactional Cluster Metadata
>> Migrating from optimistic modification and propagation of cluster
>> metadata via gossip to a transactional implementation opens a lot of
>> possibilities. Token movements and instance replacements get safer and
>> faster. Schema changes can be made atomic, enabling users to execute DDL
>> rapidly without waiting for convergence. Operations like expansions and
>> shrinks become easier to automate with less care and feeding.
>>
>> – Paxos improvements
>> During discussion on C-12126, Benedict expressed interest in post-4.0
>> improvements that can be made to Cassandra’s Paxos / LWT implementation
>> that would enable the database to serve serial writes with two round-trips
>> and serial reads with one round-trip in the uncontended case. For many
>> cross-WAN serial use cases, this may halve the latency of CAS queries.
>>
>> – Multi-Partition LWTs
>> LWT is a great primitive, but modeling applications with the constraint
>> of single-key CAS can be a game of Twister. Extending the Paxos
>> improvements discussed above to enable multi-partition CAS would enable
>> users of Apache Cassandra to perform serial operations across partition
>> boundaries.
>>
>> – Downgrade-ability
>> I also see “downgradeability” as important to future new release
>> adoption. Taking file format changes as an example, it’s currently not
>> possible to downgrade in the event that a serious issue has been identified
>> – unless you’re able to host-replace yourself out after upgrading one
>> replica, or revert to a pre-upgrade snapshot and accept data loss. It would
>> be excellent if it were possible for v.next to continue writing the
>> previous SSTable/commitlog/hint/etc. format until a switch is flipped to
>> opt into new file formats. Apache HDFS takes a similar approach, enabling
>> downgrade until NameNode metadata is finalized [1].
>> This would be an excellent capability to have in Apache Cassandra, and
>> would dramatically lower the stakes for new release adoption.
>>
>> On pluggability / disaggregation:
>> I agree that these are important themes. We’ll want to bring a lot of
>> care and attention to this work. Disaggregation can open a lot of
>> possibilities - with the drawback of future changes being restricted to the
>> defined interface and an inability to optimize across interface boundaries.
>> We can probably hit a sweet spot, though.
>>
>> Toolchains to validate implementations of pluggable components will
>> become very important. It would be bad for the project’s users if bundled
>> implementations were of uneven quality or supported subsets of
>> functionality. Converging on a common validation toolchain for pluggable
>> subsystems can help us ensure that quality while minimizing the effort
>> required to test new implementations.
>>
>> Finally, I think it's important we work to maintain trunk in a shippable
>> state. This might look like major changes and new features hiding behind
>> feature flags that enable users to selectively enable them as development
>> and validation proceeds, with new code executed regardless of the flag held
>> to a higher standard.
>>
>> Cheers,
>>
>> – Scott
>>
>> [1]
>> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
>>
>>
>> ________________________________________
>> From: guo Maxwell <cclive1...@gmail.com>
>> Sent: Wednesday, April 14, 2021 10:25 PM
>> To: dev@cassandra.apache.org
>> Subject: Re: [DISCUSSION] Next release roadmap
>>
>> +1
>>
>> Brandon Williams <dri...@gmail.com> wrote on Thu, Apr 15, 2021 at 4:48 AM:
>>
>> > Agreed. Everyone just please keep in mind this thread is for roadmap
>> > contributions you plan to make, not contributions you would like to
>> > see.
>> >
>> > On Wed, Apr 14, 2021 at 3:45 PM Nate McCall <zznat...@gmail.com> wrote:
>> > >
>> > > Agree with Stefan 100% on this.
>> > > We need to move towards pluggability. Our
>> > > users are asking for it, it makes sense architecturally, and people are
>> > > doing it anyway.
>> > >
>> > >
>> > > ...
>> > > > for me definitely https://issues.apache.org/jira/browse/CASSANDRA-9633
>> > > >
>> > > > I am surprised nobody mentioned this in the previous answers, there are
>> > > > ~50 people waiting for it to happen and multiple people working on it
>> > > > seriously and wanting that feature to be there for so so long.
>> > > > ...
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> > For additional commands, e-mail: dev-h...@cassandra.apache.org
>> >
>> >
>>
>> --
>> you are the apple of my eye !