Using TCM to distribute this information across the cluster vs. using some other LWT-ish distributed CP solution higher in the stack should offer effectively the same UX guarantees to us and our users, right? So I think it's still quite viable, even if we're just LWT'ing things into distributed tables, doing something silly like CL_ALL, etc.
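To make that concrete, here's a minimal sketch of the kind of thing I mean (the keyspace, table, and config names are all hypothetical, not a proposal):

CREATE TABLE IF NOT EXISTS system_distributed.shared_config (
    name text PRIMARY KEY,
    value text
);

-- LWT the new value in so concurrent writers serialize cleanly
UPDATE system_distributed.shared_config
SET value = '128MiB/s'
WHERE name = 'compaction_throughput'
IF value = '64MiB/s';

-- readers observe the committed value with a SERIAL read (or a high CL)
SELECT value FROM system_distributed.shared_config
WHERE name = 'compaction_throughput';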
On Wed, Jan 29, 2025, at 5:44 AM, Štefan Miklošovič wrote: > I want to ask about this ticket in particular, I know I am somehow hijacking > this thread but taking recent discussion into account where we kind of > rejected the idea of using the TCM log for storing configuration, what does this > mean for tickets like this? Is this still viable or do we need to completely > diverge from this approach and figure out something else? > > Thanks > > (1) https://issues.apache.org/jira/browse/CASSANDRA-19130 > > On Tue, Jan 7, 2025 at 1:04 PM Štefan Miklošovič <smikloso...@apache.org> > wrote: >> It would be cool if it acted like this, then the whole plugin would >> become irrelevant when it comes to the migrations. >> >> https://github.com/instaclustr/cassandra-everywhere-strategy >> https://github.com/instaclustr/cassandra-everywhere-strategy?tab=readme-ov-file#motivation >> >> On Mon, Jan 6, 2025 at 11:09 PM Jon Haddad <j...@rustyrazorblade.com> wrote: >>> What about finally adding a much-desired EverywhereStrategy? It wouldn't >>> just be useful for config - system_auth bites a lot of people today. >>> >>> As much as I don't like to suggest row cache, it might be a good fit here >>> as well. We could remove the custom code around the auth cache in the process. >>> >>> Jon >>> >>> On Mon, Jan 6, 2025 at 12:48 PM Benedict Elliott Smith >>> <bened...@apache.org> wrote: >>>> The more we talk about this, the more my position crystallises against >>>> this approach. The feature we’re discussing here should be easy to >>>> implement on top of user-facing functionality; we aren’t the only people >>>> who want functionality like this. We should be dogfooding our own UX for >>>> this kind of capability. >>>> >>>> TCM is unique in that it *cannot* dogfood the database. As a result it is >>>> not only critical for correctness, it’s also more complex - and >>>> inefficient - than a native database feature could be. It’s the worst of >>>> both worlds: we couple critical functionality to non-critical features, >>>> and couple those non-critical features to more complex logic than they >>>> need. >>>> >>>> My vote would be to introduce a new table feature that provides a >>>> node-local, time-bounded cache, so that you can safely perform CL.ONE >>>> queries against it, and let the whole world use it. >>>> >>>> >>>>> On 6 Jan 2025, at 18:23, Blake Eggleston <beggles...@apple.com> wrote: >>>>> >>>>>>>>>> TCM was designed with a couple of very specific correctness-critical >>>>>>>>>> use cases in mind, not as a generic mechanism for everyone to extend. >>>>> >>>>> Its initial scope was for those use cases, but its potential for >>>>> enabling more sophisticated functionality was one of its selling points >>>>> and is listed in the CEP. >>>>> >>>>>> Folks transitively breaking cluster membership by accidentally breaking >>>>>> the shared dependency of a non-critical feature is a risk I don’t like >>>>>> much. >>>>> >>>>> Having multiple distributed config systems operating independently is >>>>> going to create its own set of problems, especially if the distributed >>>>> config has any level of interaction with schema or topology. >>>>> >>>>> I lean towards distributed config going into TCM, although a more >>>>> friendly API for extension that offers some guardrails would be a good >>>>> idea. >>>>> >>>>>> On Jan 6, 2025, at 9:21 AM, Aleksey Yeshchenko <alek...@apple.com> wrote: >>>>>> >>>>>>> Would you mind elaborating on what makes it unsuitable? 
I don’t have a >>>>>>> good mental model on its properties, so I assumed that it could be used >>>>>>> to disseminate arbitrary key-value pairs like config fairly easily. >>>>>> >>>>>> It’s more than *capable* of disseminating arbitrary-ish key-value pairs >>>>>> - it can deal with schema after all. >>>>>> >>>>>> I claim it to be *unsuitable* because of the coupling it would introduce >>>>>> between components of different levels of criticality. You can derisk it >>>>>> partially by having separate logs (which might not be trivial to >>>>>> implement). But unless you also duplicate all the TCM logic in some >>>>>> other package, the shared code dependency coupling persists. Folks >>>>>> transitively breaking cluster membership by accidentally breaking the >>>>>> shared dependency of a non-critical feature is a risk I don’t like much. >>>>>> Keep it tight, single-purpose, let it harden over time without being >>>>>> disrupted. >>>>>> >>>>>>> On 6 Jan 2025, at 16:54, Aleksey Yeshchenko <alek...@apple.com> wrote: >>>>>>> >>>>>>> I agree that this would be useful, yes. >>>>>>> >>>>>>> An LWT/Accord variant plus a plain-writes eventually-consistent >>>>>>> variant. A generic-by-design internal-only per-table mechanism with >>>>>>> optional caching + optional write notifications issued to non-replicas. >>>>>>> >>>>>>>> On 6 Jan 2025, at 14:26, Josh McKenzie <jmcken...@apache.org> wrote: >>>>>>>> >>>>>>>>> I think if we go down the route of pushing configs around with LWT + >>>>>>>>> caching instead, we should have that be a generic system that is >>>>>>>>> designed for everyone to use. >>>>>>>> Agreed. Otherwise we end up with the same problem Aleksey's speaking >>>>>>>> about above, where we build something for a specific purpose and then >>>>>>>> maintainers in the future with a reasonable need extend or bend it to >>>>>>>> fit their new need, risking destabilizing the original implementation. >>>>>>>> >>>>>>>> Better to have a solid shared primitive other features can build upon. >>>>>>>> >>>>>>>> On Mon, Jan 6, 2025, at 8:33 AM, Jon Haddad wrote: >>>>>>>>> Would you mind elaborating on what makes it unsuitable? I don’t have >>>>>>>>> a good mental model on its properties, so I assumed that it could be >>>>>>>>> used to disseminate arbitrary key-value pairs like config fairly >>>>>>>>> easily. >>>>>>>>> >>>>>>>>> Somewhat humorously, I think that same assumption was made when >>>>>>>>> putting SAI metadata into gossip, which caused a cluster with 800 2i >>>>>>>>> to break it. >>>>>>>>> >>>>>>>>> I think if we go down the route of pushing configs around with LWT + >>>>>>>>> caching instead, we should have that be a generic system that is >>>>>>>>> designed for everyone to use. Then we have a gossip replacement, >>>>>>>>> reduce config clutter, and people have something that can be used >>>>>>>>> without adding another bespoke system into the mix. >>>>>>>>> >>>>>>>>> Jon >>>>>>>>> >>>>>>>>> On Mon, Jan 6, 2025 at 6:48 AM Aleksey Yeshchenko <alek...@apple.com> >>>>>>>>> wrote: >>>>>>>>>> TCM was designed with a couple of very specific correctness-critical >>>>>>>>>> use cases in mind, not as a generic mechanism for everyone to extend. >>>>>>>>>> >>>>>>>>>> It might be *convenient* to employ TCM for some other features, >>>>>>>>>> which makes it tempting to abuse TCM for an unintended purpose, but >>>>>>>>>> we shouldn’t do what's convenient over what is right. There are >>>>>>>>>> several ways this often goes wrong. 
>>>>>>>>>> >>>>>>>>>> For example, the subsystem gets used as is, without modification, by >>>>>>>>>> a new feature, but in ways that invalidate the assumptions behind >>>>>>>>>> the design of the subsystem - designed for particular use cases. >>>>>>>>>> >>>>>>>>>> For another example, the subsystem *almost* works as is for the new >>>>>>>>>> feature, but doesn't *quite* work as is, so changes are made to it, >>>>>>>>>> and reviewed, by someone not familiar enough with the subsystem >>>>>>>>>> design and implementation. One such change eventually introduces >>>>>>>>>> a bug into the shared critical subsystem, and now everyone is having a >>>>>>>>>> bad time. >>>>>>>>>> >>>>>>>>>> The risks are real, and I’d strongly prefer that we didn’t co-opt a >>>>>>>>>> critical subsystem for a non-critical use-case for this reason alone. >>>>>>>>>> >>>>>>>>>>> On 21 Dec 2024, at 23:18, Jordan West <jorda...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> I tend to lean towards Josh's perspective. Gossip was poorly tested >>>>>>>>>>> and implemented. I don't think it's a good parallel, or at least I >>>>>>>>>>> hope it's not. Taken to the extreme, we shouldn't touch the database >>>>>>>>>>> at all otherwise, which isn't practical. That said, anything >>>>>>>>>>> touching important subsystems needs more care, testing, and time to >>>>>>>>>>> bake. I think we're mostly discussing "being careful", which I am >>>>>>>>>>> totally on board with. I don't think Benedict ever said "don't use >>>>>>>>>>> TCM", in fact he's said the opposite, but emphasized the care that >>>>>>>>>>> is required when we do, which is totally reasonable. >>>>>>>>>>> >>>>>>>>>>> Back to capabilities, Riak built them on an eventually consistent >>>>>>>>>>> subsystem and they worked fine. If you have a split brain you >>>>>>>>>>> likely don't want to communicate agreement as is (or have already >>>>>>>>>>> learned about agreement and it's not an issue). That said, I don't >>>>>>>>>>> think we have an EC layer in C* I would want to rely on outside of >>>>>>>>>>> distributed tables. So in the context of what we have today, I >>>>>>>>>>> think TCM is a better fit. I still need to dig a little more to be >>>>>>>>>>> convinced and plan to do that as I draft the CEP. >>>>>>>>>>> >>>>>>>>>>> Jordan >>>>>>>>>>> >>>>>>>>>>> On Sat, Dec 21, 2024 at 5:51 AM Benedict <bened...@apache.org> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> I’m not saying we need to tease out bugs from TCM. I’m saying >>>>>>>>>>>> every time someone touches something this central to correctness >>>>>>>>>>>> we introduce a risk of breaking it, and that we should take >>>>>>>>>>>> that risk judiciously. This has zero to do with the amount of data >>>>>>>>>>>> we’re pushing through it, and 100% to do with writing bad code. >>>>>>>>>>>> >>>>>>>>>>>> We treated gossip carefully in part because it was hard to work >>>>>>>>>>>> with, but in part because getting it wrong was particularly bad. >>>>>>>>>>>> We should retain the latter reason for caution. >>>>>>>>>>>> >>>>>>>>>>>> We also absolutely do not need TCM for consistency. We have >>>>>>>>>>>> consistent database functionality for that. TCM is special because >>>>>>>>>>>> it cannot rely on the database mechanisms, as it underpins them. >>>>>>>>>>>> That is the whole point of why we should treat it carefully. 
>>>>>>>>>>>> >>>>>>>>>>>>> On 21 Dec 2024, at 13:43, Josh McKenzie <jmcken...@apache.org> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> To play the devil's advocate - the more we exercise TCM the more >>>>>>>>>>>>> bugs we suss out. To Jon's point, the volume of information we're >>>>>>>>>>>>> talking about here in terms of capabilities dissemination >>>>>>>>>>>>> shouldn't stress TCM at all. >>>>>>>>>>>>> >>>>>>>>>>>>> I think a reasonable heuristic for relying on TCM for something >>>>>>>>>>>>> is whether there's a big difference in UX on something being >>>>>>>>>>>>> eventually consistent vs. strongly consistent. Exposing features >>>>>>>>>>>>> to clients based on whether the entire cluster supports them >>>>>>>>>>>>> seems like the kind of thing that could cause pain if we're in a >>>>>>>>>>>>> split-brain, cluster-is-settling-on-agreement kind of paradigm. >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Dec 20, 2024, at 3:17 PM, Benedict wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Mostly conceptual; the problem with a linearizable history is >>>>>>>>>>>>>> that if you lose some of it (e.g. because some logic bug prevents >>>>>>>>>>>>>> you from processing some epoch) you stop the world until an >>>>>>>>>>>>>> operator can step in to perform surgery on what the history >>>>>>>>>>>>>> should be. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I do know of one recent bug in schema changes in CEP-15 that >>>>>>>>>>>>>> broke TCM in this way. That particular avenue will be hardened, >>>>>>>>>>>>>> but the fewer places we risk this the better IMO. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Of course, there are steps we could take to expose a limited API >>>>>>>>>>>>>> targeting these use cases, as well as using a separate log for >>>>>>>>>>>>>> ancillary functionality, that might better balance risk:reward. >>>>>>>>>>>>>> But equally I’m not sure it makes sense to TCM all the things, >>>>>>>>>>>>>> and maybe dogfooding our own database features and developing >>>>>>>>>>>>>> functionality that enables our own use cases could be better >>>>>>>>>>>>>> where it isn’t necessary 🤷‍♀️ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 20 Dec 2024, at 19:22, Jordan West <jorda...@gmail.com> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 11:06 AM Benedict <bened...@apache.org> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If TCM breaks we all have a really bad time, much worse than >>>>>>>>>>>>>>>> if any one of these features individually has problems. If you >>>>>>>>>>>>>>>> break TCM in the right way the cluster could become >>>>>>>>>>>>>>>> inoperable, or operations like topology changes may be >>>>>>>>>>>>>>>> prevented. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Benedict, when you say this are you speaking hypothetically (in >>>>>>>>>>>>>>> the sense that by using TCM more we increase the probability of >>>>>>>>>>>>>>> using it "wrong" and hitting an unknown edge case) or are there >>>>>>>>>>>>>>> known ways today that TCM "breaks"? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This means that even a parallel log has some risk if we end up >>>>>>>>>>>>>>>> modifying shared functionality. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 20 Dec 2024, at 18:47, Štefan Miklošovič >>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I stand corrected. The C in TCM is "cluster" :D Anyway, >>>>>>>>>>>>>>>>> configuration is super reasonable to put there. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:42 PM Štefan Miklošovič >>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>> I am super hesitant to base distributed guardrails or any >>>>>>>>>>>>>>>>>> configuration for that matter on anything but TCM. Doesn't >>>>>>>>>>>>>>>>>> the "C" in TCM stand for "configuration" anyway? So rename it to >>>>>>>>>>>>>>>>>> TSM, as in "schema", if it is meant to be just for that. >>>>>>>>>>>>>>>>>> It seems quite ridiculous to code tables with caches >>>>>>>>>>>>>>>>>> on top when, thanks to CEP-21, we have far more effective >>>>>>>>>>>>>>>>>> tooling to deal with that, with the clear advantage of getting >>>>>>>>>>>>>>>>>> rid of all of that old mechanism we have in place. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have not seen any concrete examples of risks showing why TCM >>>>>>>>>>>>>>>>>> should be used just for what it is currently for. Why not put >>>>>>>>>>>>>>>>>> configuration meant to be cluster-wide in there? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> What is it ... performance? What does the term >>>>>>>>>>>>>>>>>> "additional complexity" even mean? Complex in what way? Do you think >>>>>>>>>>>>>>>>>> that putting three types of transformations there which, in the case of >>>>>>>>>>>>>>>>>> guardrails, flip some booleans and numbers would >>>>>>>>>>>>>>>>>> suddenly make TCM way more complex? Come on ... >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This has nothing to do with what Jordan is trying to >>>>>>>>>>>>>>>>>> introduce. I think we all agree he knows what he is doing >>>>>>>>>>>>>>>>>> and if he evaluates that TCM is too much for his use case >>>>>>>>>>>>>>>>>> (or it is not a good fit) that is perfectly fine. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:22 PM Paulo Motta >>>>>>>>>>>>>>>>>> <pa...@apache.org> wrote: >>>>>>>>>>>>>>>>>>> > It should be possible to use distributed system tables >>>>>>>>>>>>>>>>>>> > just fine for capabilities, config and guardrails. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have been thinking about this recently and I agree we >>>>>>>>>>>>>>>>>>> should be wary about introducing new TCM states and creating >>>>>>>>>>>>>>>>>>> additional complexity that can be serviced by existing data >>>>>>>>>>>>>>>>>>> dissemination mechanisms (gossip/system tables). I would >>>>>>>>>>>>>>>>>>> prefer that we take a more phased and incremental approach >>>>>>>>>>>>>>>>>>> to introducing new TCM states. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As a way to accomplish that, I have thought about >>>>>>>>>>>>>>>>>>> introducing a new generic TCM state "In Maintenance", where >>>>>>>>>>>>>>>>>>> schema or membership changes are "frozen/disallowed" while >>>>>>>>>>>>>>>>>>> an external operation is taking place. This "external >>>>>>>>>>>>>>>>>>> operation" could mean many things: >>>>>>>>>>>>>>>>>>> - Upgrade >>>>>>>>>>>>>>>>>>> - Downgrade >>>>>>>>>>>>>>>>>>> - Migration >>>>>>>>>>>>>>>>>>> - Capability Enablement/Disablement >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> These could be sub-states of the "Maintenance" TCM state, >>>>>>>>>>>>>>>>>>> which could be managed externally (via cache/gossip/system >>>>>>>>>>>>>>>>>>> tables/sidecar). Once these sub-states are validated >>>>>>>>>>>>>>>>>>> thoroughly and are mature enough, we could "promote" them to >>>>>>>>>>>>>>>>>>> top-level TCM states. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In the end, what really matters is that cluster membership >>>>>>>>>>>>>>>>>>> and schema changes do not happen while a miscellaneous >>>>>>>>>>>>>>>>>>> operation is taking place. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Would this make sense as an initial way to integrate TCM >>>>>>>>>>>>>>>>>>> with the capabilities framework? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 4:53 AM Benedict >>>>>>>>>>>>>>>>>>> <bened...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> If you perform a read from a distributed table on startup >>>>>>>>>>>>>>>>>>>> you will find the latest information. What catchup are you >>>>>>>>>>>>>>>>>>>> thinking of? I don’t think any of the features we talked >>>>>>>>>>>>>>>>>>>> about need a log, only the latest information. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We can (and should) probably introduce event listeners for >>>>>>>>>>>>>>>>>>>> distributed tables, as this is also a really great >>>>>>>>>>>>>>>>>>>> feature, but I don’t think this should be necessary here. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding disagreements: if you use LWTs then there are no >>>>>>>>>>>>>>>>>>>> consistency issues to worry about. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Again, I’m not opposed to using TCM, although I am a >>>>>>>>>>>>>>>>>>>> little worried TCM is becoming our new hammer with >>>>>>>>>>>>>>>>>>>> everything a nail. It would be better IMO to keep TCM >>>>>>>>>>>>>>>>>>>> scoped to essential functionality as it’s critical to >>>>>>>>>>>>>>>>>>>> correctness. Perhaps we could extend its APIs to less >>>>>>>>>>>>>>>>>>>> critical services without intertwining them with >>>>>>>>>>>>>>>>>>>> membership, schema and epoch handling. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 20 Dec 2024, at 09:43, Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I find TCM way more comfortable to work with. The >>>>>>>>>>>>>>>>>>>>> capability of the log being replayed on restart and catching >>>>>>>>>>>>>>>>>>>>> up with everything else automatically is a godsend. If we >>>>>>>>>>>>>>>>>>>>> had that on "good old distributed tables", then is it not >>>>>>>>>>>>>>>>>>>>> true that we would need to take extra care of that, e.g. >>>>>>>>>>>>>>>>>>>>> we would need to repair it etc ... It might be a source >>>>>>>>>>>>>>>>>>>>> of discrepancies / disagreements etc. TCM is just >>>>>>>>>>>>>>>>>>>>> "maintenance-free" and _just works_. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I think I was also investigating distributed tables but >>>>>>>>>>>>>>>>>>>>> was just pulled towards TCM naturally because of its >>>>>>>>>>>>>>>>>>>>> goodies. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 10:08 AM Benedict >>>>>>>>>>>>>>>>>>>>> <bened...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> TCM is a perfectly valid basis for this, but TCM is only >>>>>>>>>>>>>>>>>>>>>> really *necessary* to solve meta config problems where >>>>>>>>>>>>>>>>>>>>>> we can’t rely on the rest of the database working. >>>>>>>>>>>>>>>>>>>>>> Particularly placement issues, which is why schema and >>>>>>>>>>>>>>>>>>>>>> membership need to live there. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> It should be possible to use distributed system tables >>>>>>>>>>>>>>>>>>>>>> just fine for capabilities, config and guardrails. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> That said, it’s possible config might be better >>>>>>>>>>>>>>>>>>>>>> represented as part of the schema (and we already store >>>>>>>>>>>>>>>>>>>>>> some relevant config there) in which case it would live >>>>>>>>>>>>>>>>>>>>>> in TCM automatically. 
Migrating existing configs to a >>>>>>>>>>>>>>>>>>>>>> distributed setup will be fun however we do it though. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Capabilities also feel naturally related to other >>>>>>>>>>>>>>>>>>>>>> membership information, so TCM might be the most >>>>>>>>>>>>>>>>>>>>>> suitable place, particularly for handling downgrades >>>>>>>>>>>>>>>>>>>>>> after capabilities have been enabled (if we ever expect >>>>>>>>>>>>>>>>>>>>>> to support turning off capabilities and then downgrading >>>>>>>>>>>>>>>>>>>>>> - which today we mostly don’t). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 20 Dec 2024, at 08:42, Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Jordan, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I also think that having it on TCM would be ideal and >>>>>>>>>>>>>>>>>>>>>>> we should explore this path first before doing anything >>>>>>>>>>>>>>>>>>>>>>> custom. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Regarding my idea about the guardrails in TCM, when I >>>>>>>>>>>>>>>>>>>>>>> prototyped that and wanted to make it happen, there was >>>>>>>>>>>>>>>>>>>>>>> a little bit of a pushback (1) (even though a super >>>>>>>>>>>>>>>>>>>>>>> reasonable one) that TCM is just too young at the >>>>>>>>>>>>>>>>>>>>>>> moment and it would be desirable to go through some >>>>>>>>>>>>>>>>>>>>>>> stabilisation period. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Another idea was that we should not just make >>>>>>>>>>>>>>>>>>>>>>> guardrails happen but that the whole config should be in >>>>>>>>>>>>>>>>>>>>>>> TCM. From what I put together, Sam / Alex do not seem >>>>>>>>>>>>>>>>>>>>>>> to be opposed to this idea, rather the opposite, but >>>>>>>>>>>>>>>>>>>>>>> having a CEP about that is way more involved than having >>>>>>>>>>>>>>>>>>>>>>> just guardrails there. I consider guardrails to be kind >>>>>>>>>>>>>>>>>>>>>>> of special and I do not think that having all >>>>>>>>>>>>>>>>>>>>>>> configurations in TCM (which guardrails are part of) is >>>>>>>>>>>>>>>>>>>>>>> an absolute must in order to deliver that. I may start >>>>>>>>>>>>>>>>>>>>>>> with a guardrails CEP and you may explore a Capabilities >>>>>>>>>>>>>>>>>>>>>>> CEP on TCM too, if that makes sense? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I just wanted to raise the point about the time this >>>>>>>>>>>>>>>>>>>>>>> would be delivered. If Capabilities are built on TCM >>>>>>>>>>>>>>>>>>>>>>> and I wanted to do Guardrails on TCM too but was >>>>>>>>>>>>>>>>>>>>>>> told it is probably too soon, I guess you would >>>>>>>>>>>>>>>>>>>>>>> experience something similar. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Sam's comment is from May and maybe a lot has changed >>>>>>>>>>>>>>>>>>>>>>> since then and his comment is not applicable >>>>>>>>>>>>>>>>>>>>>>> anymore. It would be great to know if we could build on >>>>>>>>>>>>>>>>>>>>>>> top of the current trunk already or whether we will wait >>>>>>>>>>>>>>>>>>>>>>> until 5.1/6.0 is delivered. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> (1) >>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-19593?focusedCommentId=17844326&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17844326 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 2:17 AM Jordan West >>>>>>>>>>>>>>>>>>>>>>> <jorda...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> Firstly, glad to see the support and enthusiasm here >>>>>>>>>>>>>>>>>>>>>>>> and in the recent Slack discussion. I think there is >>>>>>>>>>>>>>>>>>>>>>>> enough for me to start drafting a CEP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Stefan, global configuration and capabilities do have >>>>>>>>>>>>>>>>>>>>>>>> some overlap but not full overlap. For example, you >>>>>>>>>>>>>>>>>>>>>>>> may want to set globally that a cluster enables >>>>>>>>>>>>>>>>>>>>>>>> feature X or control the threshold for a guardrail but >>>>>>>>>>>>>>>>>>>>>>>> you still need to know if all nodes support feature X >>>>>>>>>>>>>>>>>>>>>>>> or have that guardrail; the latter is what >>>>>>>>>>>>>>>>>>>>>>>> capabilities targets. I do think capabilities are a >>>>>>>>>>>>>>>>>>>>>>>> step towards supporting global configuration and the >>>>>>>>>>>>>>>>>>>>>>>> work you described is another step (that we could do >>>>>>>>>>>>>>>>>>>>>>>> after capabilities or in parallel with them in mind). >>>>>>>>>>>>>>>>>>>>>>>> I am also supportive of exploring global configuration >>>>>>>>>>>>>>>>>>>>>>>> for the reasons you mentioned. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> In terms of how capabilities get propagated across the >>>>>>>>>>>>>>>>>>>>>>>> cluster, I hadn't put much thought into it yet beyond >>>>>>>>>>>>>>>>>>>>>>>> it likely being TCM, since this will be a new feature that lands >>>>>>>>>>>>>>>>>>>>>>>> after TCM. In Riak, we had gossip (but more mature >>>>>>>>>>>>>>>>>>>>>>>> than C*'s -- this was an area I contributed to a lot, so >>>>>>>>>>>>>>>>>>>>>>>> I am very familiar with it) to disseminate less critical >>>>>>>>>>>>>>>>>>>>>>>> information such as capabilities, and a separate layer >>>>>>>>>>>>>>>>>>>>>>>> that did TCM. Since we don't have this in C* I don't >>>>>>>>>>>>>>>>>>>>>>>> think we would want to build a separate distribution >>>>>>>>>>>>>>>>>>>>>>>> channel for capabilities metadata when we already have >>>>>>>>>>>>>>>>>>>>>>>> TCM in place. But I plan to explore this more as I >>>>>>>>>>>>>>>>>>>>>>>> draft the CEP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 1:48 PM Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Jordan, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> what would this look like from the implementation >>>>>>>>>>>>>>>>>>>>>>>>> perspective? I was experimenting with transactional >>>>>>>>>>>>>>>>>>>>>>>>> guardrails where an operator would control the >>>>>>>>>>>>>>>>>>>>>>>>> content of a virtual table which would be backed by >>>>>>>>>>>>>>>>>>>>>>>>> TCM, so whatever guardrail we changed would >>>>>>>>>>>>>>>>>>>>>>>>> be automatically and transparently propagated to >>>>>>>>>>>>>>>>>>>>>>>>> every node in a cluster. The POC worked quite nicely. >>>>>>>>>>>>>>>>>>>>>>>>> TCM is just a vehicle to commit a change which would >>>>>>>>>>>>>>>>>>>>>>>>> spread around and all these settings would survive >>>>>>>>>>>>>>>>>>>>>>>>> restarts. 
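>>>>>>>>>>>>>>>>>>>>>>>>> To sketch the operator-facing side of what I mean (the virtual table and column names below are purely illustrative, not a committed interface):
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> -- flip a guardrail once; the change is committed via TCM and replicated cluster-wide
>>>>>>>>>>>>>>>>>>>>>>>>> UPDATE system_guardrails.flags SET value = false WHERE name = 'drop_truncate_table_enabled';
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> -- every node, now and after any restart, reads the committed value back
>>>>>>>>>>>>>>>>>>>>>>>>> SELECT name, value FROM system_guardrails.flags WHERE name = 'drop_truncate_table_enabled';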
We would have the same configuration >>>>>>>>>>>>>>>>>>>>>>>>> everywhere which is not currently the case because >>>>>>>>>>>>>>>>>>>>>>>>> guardrails are configured per node and if not >>>>>>>>>>>>>>>>>>>>>>>>> persisted to yaml, on restart their values would be >>>>>>>>>>>>>>>>>>>>>>>>> forgotten. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Guardrails are just an example; the obvious next step >>>>>>>>>>>>>>>>>>>>>>>>> is to expand this idea to the whole configuration in >>>>>>>>>>>>>>>>>>>>>>>>> yaml. Of course, not all properties in yaml make >>>>>>>>>>>>>>>>>>>>>>>>> sense to be the same cluster-wide (IP addresses etc >>>>>>>>>>>>>>>>>>>>>>>>> ...), but the ones which do would again be set >>>>>>>>>>>>>>>>>>>>>>>>> everywhere the same way. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> The approach I described above is that we make sure >>>>>>>>>>>>>>>>>>>>>>>>> that the configuration is the same everywhere, hence >>>>>>>>>>>>>>>>>>>>>>>>> there can be no misunderstanding about what features this >>>>>>>>>>>>>>>>>>>>>>>>> or that node has; if we say that all nodes have to >>>>>>>>>>>>>>>>>>>>>>>>> have a particular feature because we said so in the TCM >>>>>>>>>>>>>>>>>>>>>>>>> log, then on restart / replay a node will "catch up" >>>>>>>>>>>>>>>>>>>>>>>>> with whatever features it is asked to turn on. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Your approach seems to be that we distribute all the >>>>>>>>>>>>>>>>>>>>>>>>> capabilities / features a cluster supports and that >>>>>>>>>>>>>>>>>>>>>>>>> each individual node configures itself in some way (or >>>>>>>>>>>>>>>>>>>>>>>>> not) to comply? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Is there any intersection in these approaches? At >>>>>>>>>>>>>>>>>>>>>>>>> first sight they seem somewhat related. How is one >>>>>>>>>>>>>>>>>>>>>>>>> different from the other from your point of view? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Regards >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> (1) >>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-19593 >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 12:00 AM Jordan West >>>>>>>>>>>>>>>>>>>>>>>>> <jw...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> In a recent discussion on the pains of upgrading, one >>>>>>>>>>>>>>>>>>>>>>>>>> topic that came up was a feature that Riak had called >>>>>>>>>>>>>>>>>>>>>>>>>> Capabilities [1]. A major pain with upgrades is that >>>>>>>>>>>>>>>>>>>>>>>>>> each node independently decides when to start using >>>>>>>>>>>>>>>>>>>>>>>>>> new or modified functionality. Even when we put this >>>>>>>>>>>>>>>>>>>>>>>>>> behind a config (like storage compatibility mode) >>>>>>>>>>>>>>>>>>>>>>>>>> each node immediately enables the feature when the >>>>>>>>>>>>>>>>>>>>>>>>>> config is changed and the node is restarted. This >>>>>>>>>>>>>>>>>>>>>>>>>> causes various types of upgrade pain such as failed >>>>>>>>>>>>>>>>>>>>>>>>>> streams and schema disagreement. A recent example of >>>>>>>>>>>>>>>>>>>>>>>>>> this is CASSANDRA-20118 [2]. In some cases operators >>>>>>>>>>>>>>>>>>>>>>>>>> can prevent this from happening through careful >>>>>>>>>>>>>>>>>>>>>>>>>> coordination (e.g. ensuring upgradesstables only >>>>>>>>>>>>>>>>>>>>>>>>>> runs after the whole cluster is upgraded) but this >>>>>>>>>>>>>>>>>>>>>>>>>> typically requires custom code in whatever control >>>>>>>>>>>>>>>>>>>>>>>>>> plane the operator is using. 
A capabilities >>>>>>>>>>>>>>>>>>>>>>>>>> framework would distribute the state of what >>>>>>>>>>>>>>>>>>>>>>>>>> features each node has (and their status, e.g. >>>>>>>>>>>>>>>>>>>>>>>>>> enabled or not) so that the cluster can choose to >>>>>>>>>>>>>>>>>>>>>>>>>> opt in to new features once the whole cluster has >>>>>>>>>>>>>>>>>>>>>>>>>> them available (see the strawman sketch at the end >>>>>>>>>>>>>>>>>>>>>>>>>> of this mail). From experience, having this in Riak >>>>>>>>>>>>>>>>>>>>>>>>>> made upgrades a significantly less risky process and >>>>>>>>>>>>>>>>>>>>>>>>>> also paved a path towards repeatable downgrades. I >>>>>>>>>>>>>>>>>>>>>>>>>> think Cassandra would benefit from it as well. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Further, other tools like analytics could benefit >>>>>>>>>>>>>>>>>>>>>>>>>> from having this information since currently it's up >>>>>>>>>>>>>>>>>>>>>>>>>> to the operator to manually determine the state of >>>>>>>>>>>>>>>>>>>>>>>>>> the cluster in some cases. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I am considering drafting a CEP proposal for this >>>>>>>>>>>>>>>>>>>>>>>>>> feature but wanted to take the general temperature >>>>>>>>>>>>>>>>>>>>>>>>>> of the community and get some early thoughts while >>>>>>>>>>>>>>>>>>>>>>>>>> working on the draft. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to hearing y'all's thoughts, >>>>>>>>>>>>>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/basho/riak_core/blob/25d9a6fa917eb8a2e95795d64eb88d7ad384ed88/src/riak_core_capability.erl#L23-L72 >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> [2] >>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-20118
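>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> P.S. As a strawman for the shape of the state involved (the schema is purely illustrative, not part of any proposal yet):
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> CREATE TABLE capabilities (
>>>>>>>>>>>>>>>>>>>>>>>>>>     capability text,
>>>>>>>>>>>>>>>>>>>>>>>>>>     node_id uuid,
>>>>>>>>>>>>>>>>>>>>>>>>>>     status text,      -- e.g. 'supported', 'enabled'
>>>>>>>>>>>>>>>>>>>>>>>>>>     PRIMARY KEY (capability, node_id)
>>>>>>>>>>>>>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>>>>>>>>>>>>> -- a capability becomes eligible for cluster-wide opt-in only once
>>>>>>>>>>>>>>>>>>>>>>>>>> -- every member node reports it as at least 'supported'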