One motivating case for TCM vs non-TCM… When Accord processes the user request, we can make sure to use the configs as they were at the execution epoch… by splitting this out, all configs become non-deterministic from Accord’s point of view…
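A minimal sketch of the determinism concern above (illustrative Python with hypothetical names only, not Cassandra code): replicas that read a live, mutable config can disagree mid-execution, while replicas that resolve the config as of the transaction's execution epoch cannot.

```python
# Illustrative only: why live config reads are non-deterministic across
# replicas, while epoch-pinned reads are not. All names are hypothetical.

# Epoch-keyed history of a guardrail, e.g. a max mutation size in bytes.
config_history = {
    1: {"max_write_bytes": 1024},
    2: {"max_write_bytes": 512},   # an operator lowered the limit mid-flight
}

def check_live(current_epoch, write_size):
    # Each replica reads whatever value is current when *it* executes, so two
    # replicas executing the same transaction around the change can disagree.
    return write_size <= config_history[current_epoch]["max_write_bytes"]

def check_pinned(execution_epoch, write_size):
    # Every replica resolves the config as of the transaction's execution
    # epoch, so all replicas reach the same verdict.
    return write_size <= config_history[execution_epoch]["max_write_bytes"]

# A 600-byte write pinned to epoch 1 is always allowed, regardless of when
# a replica evaluates it; a live read flips its answer across the change.
assert check_pinned(1, 600) is True
assert check_live(1, 600) != check_live(2, 600)
```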
Simple example: let's say you add a global config that says you can't write more than X bytes; when this is outside of TCM, Accord can see multiple different values while executing the query (assuming a user changed it)…

> On Jan 29, 2025, at 11:04 AM, Paulo Motta <pa...@apache.org> wrote:
>
>> Using TCM to distribute this information across the cluster vs. using some other LWT-ish distributed CP solution higher in the stack should effectively have the same UX guarantees to us and our users right? So I think it's still quite viable, even if we're just LWT'ing things into distributed tables, doing something silly like CL_ALL, etc.
>
> +1, can we modularize/encapsulate the storage/dissemination backend needed for these features so they are pluggable?
>
> I don't think either global configuration or capabilities should be tied to the underlying storage/dissemination mechanism; this feels like an implementation detail.
>
> Ideally, if this is well modularized, we can always plug in or replace it with other backends (TCM/UDP/S3/morse code/whatever) once this is functional.
>
> On Wed, Jan 29, 2025 at 1:17 PM David Capwell <dcapw...@apple.com> wrote:
>> To be explicit about my concerns in the previous comments…
>>
>> TCM vs new table, I don't care too much. I prefer TCM over a new table, but it's a preference.
>>
>> My comments before were more about the UX of global configs. As long as we "could" (maybe per config; not every config likely needs this) allow local tmp overrides, then my concerns are kinda addressed.
>>
>>> On Jan 29, 2025, at 7:59 AM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>
>>> Using TCM to distribute this information across the cluster vs. using some other LWT-ish distributed CP solution higher in the stack should effectively have the same UX guarantees to us and our users right?
>>> So I think it's still quite viable, even if we're just LWT'ing things into distributed tables, doing something silly like CL_ALL, etc.
>>>
>>> On Wed, Jan 29, 2025, at 5:44 AM, Štefan Miklošovič wrote:
>>>> I want to ask about this ticket in particular. I know I am somewhat hijacking this thread, but taking the recent discussion into account, where we kind of rejected the idea of using the TCM log for storing configuration, what does this mean for tickets like this? Is this still viable, or do we need to completely diverge from this approach and figure out something else?
>>>>
>>>> Thanks
>>>>
>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19130
>>>>
>>>> On Tue, Jan 7, 2025 at 1:04 PM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>> It would be cool if it was acting like this; then the whole plugin would become irrelevant when it comes to the migrations.
>>>>
>>>> https://github.com/instaclustr/cassandra-everywhere-strategy
>>>> https://github.com/instaclustr/cassandra-everywhere-strategy?tab=readme-ov-file#motivation
>>>>
>>>> On Mon, Jan 6, 2025 at 11:09 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>>>> What about finally adding a much desired EverywhereStrategy? It wouldn't just be useful for config - system_auth bites a lot of people today.
>>>>
>>>> As much as I don't like to suggest row cache, it might be a good fit here as well. We could remove the custom code around the auth cache in the process.
>>>>
>>>> Jon
>>>>
>>>> On Mon, Jan 6, 2025 at 12:48 PM Benedict Elliott Smith <bened...@apache.org> wrote:
>>>> The more we talk about this, the more my position crystallises against this approach. The feature we're discussing here should be easy to implement on top of user-facing functionality; we aren't the only people who want functionality like this.
>>>> We should be dogfooding our own UX for this kind of capability.
>>>>
>>>> TCM is unique in that it cannot dogfood the database. As a result it is not only critical for correctness, it's also more complex - and inefficient - than a native database feature could be. It's the worst of both worlds: we couple critical functionality to non-critical features, and couple those non-critical features to more complex logic than they need.
>>>>
>>>> My vote would be to introduce a new table feature that provides a node-local, time-bounded cache, so that you can safely perform CL.ONE queries against it, and let the whole world use it.
>>>>
>>>>> On 6 Jan 2025, at 18:23, Blake Eggleston <beggles...@apple.com> wrote:
>>>>>
>>>>>> TCM was designed with a couple of very specific correctness-critical use cases in mind, not as a generic mechanism for everyone to extend.
>>>>>
>>>>> Its initial scope was for those use cases, but its potential for enabling more sophisticated functionality was one of its selling points and is listed in the CEP.
>>>>>
>>>>>> Folks transitively breaking cluster membership by accidentally breaking the shared dependency of a non-critical feature is a risk I don't like much.
>>>>>
>>>>> Having multiple distributed config systems operating independently is going to create its own set of problems, especially if the distributed config has any level of interaction with schema or topology.
>>>>>
>>>>> I lean towards distributed config going into TCM, although a more friendly API for extension that offers some guardrails would be a good idea.
>>>>>
>>>>>> On Jan 6, 2025, at 9:21 AM, Aleksey Yeshchenko <alek...@apple.com> wrote:
>>>>>>
>>>>>>> Would you mind elaborating on what makes it unsuitable?
>>>>>>> I don't have a good mental model on its properties, so I assumed that it could be used to disseminate arbitrary key-value pairs like config fairly easily.
>>>>>>
>>>>>> It's more than *capable* of disseminating arbitrary-ish key-value pairs; it can deal with schema, after all.
>>>>>>
>>>>>> I claim it to be *unsuitable* because of the coupling it would introduce between components of different levels of criticality. You can derisk it partially by having separate logs (which might not be trivial to implement). But unless you also duplicate all the TCM logic in some other package, the shared code dependency coupling persists. Folks transitively breaking cluster membership by accidentally breaking the shared dependency of a non-critical feature is a risk I don't like much. Keep it tight, single-purpose; let it harden over time without being disrupted.
>>>>>>
>>>>>>> On 6 Jan 2025, at 16:54, Aleksey Yeshchenko <alek...@apple.com> wrote:
>>>>>>>
>>>>>>> I agree that this would be useful, yes.
>>>>>>>
>>>>>>> An LWT/Accord variant plus a plain-writes eventually consistent variant. A generic-by-design, internal-only, per-table mechanism with optional caching + optional write notifications issued to non-replicas.
>>>>>>>
>>>>>>>> On 6 Jan 2025, at 14:26, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>>
>>>>>>>>> I think if we go down the route of pushing configs around with LWT + caching instead, we should have that be a generic system that is designed for everyone to use.
>>>>>>>>
>>>>>>>> Agreed. Otherwise we end up with the same problem Aleksey's speaking about above, where we build something for a specific purpose, and then maintainers in the future with a reasonable need extend or bend it to fit their new need, risking destabilizing the original implementation.
>>>>>>>> Better to have a solid shared primitive other features can build upon.
>>>>>>>>
>>>>>>>> On Mon, Jan 6, 2025, at 8:33 AM, Jon Haddad wrote:
>>>>>>>>> Would you mind elaborating on what makes it unsuitable? I don't have a good mental model on its properties, so I assumed that it could be used to disseminate arbitrary key-value pairs like config fairly easily.
>>>>>>>>>
>>>>>>>>> Somewhat humorously, I think that same assumption was made when putting SAI metadata into gossip, which caused a cluster with 800 2i to break it.
>>>>>>>>>
>>>>>>>>> I think if we go down the route of pushing configs around with LWT + caching instead, we should have that be a generic system that is designed for everyone to use. Then we have a gossip replacement, reduce config clutter, and people have something that can be used without adding another bespoke system into the mix.
>>>>>>>>>
>>>>>>>>> Jon
>>>>>>>>>
>>>>>>>>> On Mon, Jan 6, 2025 at 6:48 AM Aleksey Yeshchenko <alek...@apple.com> wrote:
>>>>>>>>> TCM was designed with a couple of very specific correctness-critical use cases in mind, not as a generic mechanism for everyone to extend.
>>>>>>>>>
>>>>>>>>> It might be *convenient* to employ TCM for some other features, which makes it tempting to abuse TCM for an unintended purpose, but we shouldn't do what's convenient over what is right. There are several ways this often goes wrong.
>>>>>>>>>
>>>>>>>>> For example, the subsystem gets used as is, without modification, by a new feature, but in ways that invalidate the assumptions behind the design of the subsystem, which was designed for particular use cases.
>>>>>>>>> For another example, the subsystem *almost* works as is for the new feature, but doesn't *quite* work as is, so changes are made to it, and reviewed, by someone not familiar enough with the subsystem design and implementation. One such change eventually introduces a bug into the shared critical subsystem, and now everyone is having a bad time.
>>>>>>>>>
>>>>>>>>> The risks are real, and I'd strongly prefer that we didn't co-opt a critical subsystem for a non-critical use case for this reason alone.
>>>>>>>>>
>>>>>>>>>> On 21 Dec 2024, at 23:18, Jordan West <jorda...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> I tend to lean towards Josh's perspective. Gossip was poorly tested and implemented; I don't think it's a good parallel, or at least I hope it's not. Taken to the extreme, we shouldn't touch the database at all otherwise, which isn't practical. That said, anything touching important subsystems needs more care, testing, and time to bake. I think we're mostly discussing "being careful", which I am totally on board with. I don't think Benedict ever said "don't use TCM" (in fact he's said the opposite) but emphasized the care that is required when we do, which is totally reasonable.
>>>>>>>>>>
>>>>>>>>>> Back to capabilities: Riak built them on an eventually consistent subsystem and they worked fine. If you have a split brain, you likely don't want to communicate agreement as is (or you have already learned about agreement and it's not an issue). That said, I don't think we have an EC layer in C* I would want to rely on outside of distributed tables. So in the context of what we have existing, I think TCM is a better fit.
>>>>>>>>>> I still need to dig a little more to be convinced and plan to do that as I draft the CEP.
>>>>>>>>>>
>>>>>>>>>> Jordan
>>>>>>>>>>
>>>>>>>>>> On Sat, Dec 21, 2024 at 5:51 AM Benedict <bened...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> I'm not saying we need to tease out bugs from TCM. I'm saying every time someone touches something this central to correctness we introduce a risk of breaking it, and that we should exercise that risk judiciously. This has zero to do with the amount of data we're pushing through it, and 100% to do with writing bad code.
>>>>>>>>>>
>>>>>>>>>> We treated gossip carefully in part because it was hard to work with, but in part because getting it wrong was particularly bad. We should retain the latter reason for caution.
>>>>>>>>>>
>>>>>>>>>> We also absolutely do not need TCM for consistency. We have consistent database functionality for that. TCM is special because it cannot rely on the database mechanisms, as it underpins them. That is the whole point of why we should treat it carefully.
>>>>>>>>>>
>>>>>>>>>>> On 21 Dec 2024, at 13:43, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>> To play the devil's advocate: the more we exercise TCM, the more bugs we suss out. To Jon's point, the volume of information we're talking about here in terms of capabilities dissemination shouldn't stress TCM at all.
>>>>>>>>>>>
>>>>>>>>>>> I think a reasonable heuristic for relying on TCM for something is whether there's a big difference in UX between something being eventually consistent vs. strongly consistent.
>>>>>>>>>>> Exposing features to clients based on whether the entire cluster supports them seems like the kind of thing that could cause pain if we're in a split-brain, cluster-is-settling-on-agreement kind of paradigm.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Dec 20, 2024, at 3:17 PM, Benedict wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Mostly conceptual; the problem with a linearizable history is that if you lose some of it (e.g. because some logic bug prevents you from processing some epoch) you stop the world until an operator can step in to perform surgery on what the history should be.
>>>>>>>>>>>>
>>>>>>>>>>>> I do know of one recent bug to schema changes in cep-15 that broke TCM in this way. That particular avenue will be hardened, but the fewer places we risk this the better, IMO.
>>>>>>>>>>>>
>>>>>>>>>>>> Of course, there are steps we could take to expose a limited API targeting these use cases, as well as using a separate log for ancillary functionality, that might better balance risk:reward. But equally, I'm not sure it makes sense to TCM all the things, and maybe dogfooding our own database features and developing functionality that enables our own use cases could be better where it isn't necessary 🤷♀️
>>>>>>>>>>>>
>>>>>>>>>>>>> On 20 Dec 2024, at 19:22, Jordan West <jorda...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 11:06 AM Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> If TCM breaks we all have a really bad time, much worse than if any one of these features individually has problems. If you break TCM in the right way, the cluster could become inoperable, or operations like topology changes may be prevented.
>>>>>>>>>>>>> Benedict, when you say this, are you speaking hypothetically (in the sense that by using TCM more we increase the probability of using it "wrong" and hitting an unknown edge case) or are there known ways today that TCM "breaks"?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jordan
>>>>>>>>>>>>>
>>>>>>>>>>>>> This means that even a parallel log has some risk if we end up modifying shared functionality.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 20 Dec 2024, at 18:47, Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I stand corrected: the C in TCM is "cluster" :D Anyway, configuration is super reasonable to be put there.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:42 PM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>>>>>>>>>> I am super hesitant to base distributed guardrails, or any configuration for that matter, on anything but TCM. Doesn't the "C" in TCM stand for "configuration" anyway? So rename it to TSM, like "schema", if it is meant to be just for that. It seems quite ridiculous to code tables with caches on top when we have far more effective tooling, thanks to CEP-21, to deal with that, with the clear advantage of getting rid of all of that old mechanism we have in place.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have not seen any concrete examples of risks why using TCM should be just for what it is currently for. Why not put the configuration meant to be cluster-wide into that?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What is it ... performance? What does the term "additional complexity" even mean? Complex in what?
Do you think that putting >>>>>>>>>>>>>> there 3 types of transformations in case of guardrails which >>>>>>>>>>>>>> flip some booleans and numbers would suddenly make TCM way more >>>>>>>>>>>>>> complex? Come on ... >>>>>>>>>>>>>> >>>>>>>>>>>>>> This has nothing to do with what Jordan is trying to introduce. >>>>>>>>>>>>>> I think we all agree he knows what he is doing and if he >>>>>>>>>>>>>> evaluates that TCM is too much for his use case (or it is not a >>>>>>>>>>>>>> good fit) that is perfectly fine. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:22 PM Paulo Motta <pa...@apache.org >>>>>>>>>>>>>> <mailto:pa...@apache.org>> wrote: >>>>>>>>>>>>>> > It should be possible to use distributed system tables just >>>>>>>>>>>>>> > fine for capabilities, config and guardrails. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have been thinking about this recently and I agree we should >>>>>>>>>>>>>> be wary about introducing new TCM states and create additional >>>>>>>>>>>>>> complexity that can be serviced by existing data dissemination >>>>>>>>>>>>>> mechanisms (gossip/system tables). I would prefer that we take a >>>>>>>>>>>>>> more phased and incremental approach to introduce new TCM states. >>>>>>>>>>>>>> >>>>>>>>>>>>>> As a way to accomplish that, I have thought about introducing a >>>>>>>>>>>>>> new generic TCM state "In Maintenance", where schema or >>>>>>>>>>>>>> membership changes are "frozen/disallowed" while an external >>>>>>>>>>>>>> operation is taking place. This "external operation" could mean >>>>>>>>>>>>>> many things: >>>>>>>>>>>>>> - Upgrade >>>>>>>>>>>>>> - Downgrade >>>>>>>>>>>>>> - Migration >>>>>>>>>>>>>> - Capability Enablement/Disablement >>>>>>>>>>>>>> >>>>>>>>>>>>>> These could be sub-states of the "Maintenance" TCM state, that >>>>>>>>>>>>>> could be managed externally (via cache/gossip/system >>>>>>>>>>>>>> tables/sidecar). 
>>>>>>>>>>>>>> Once these sub-states are validated thoroughly and are mature enough, we could "promote" them to top-level TCM states.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the end, what really matters is that cluster membership and schema changes do not happen while a miscellaneous operation is taking place.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would this make sense as an initial way to integrate TCM with the capabilities framework?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 4:53 AM Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you perform a read from a distributed table on startup, you will find the latest information. What catchup are you thinking of? I don't think any of the features we talked about need a log, only the latest information.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We can (and should) probably introduce event listeners for distributed tables, as this is also a really great feature, but I don't think this should be necessary here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regarding disagreements: if you use LWTs then there are no consistency issues to worry about.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Again, I'm not opposed to using TCM, although I am a little worried TCM is becoming our new hammer, with everything a nail. It would be better, IMO, to keep TCM scoped to essential functionality, as it's critical to correctness. Perhaps we could extend its APIs to less critical services without intertwining them with membership, schema and epoch handling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 20 Dec 2024, at 09:43, Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I find TCM way more comfortable to work with.
>>>>>>>>>>>>>>> The capability of the log being replayed on restart and catching up with everything else automatically is a godsend. If we had that on "good old distributed tables", is it not true that we would need to take extra care of that, e.g. we would need to repair it, etc.? It might be the source of discrepancies / disagreements. TCM is just "maintenance-free" and _just works_.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think I was also investigating distributed tables but was just pulled towards TCM naturally because of its goodies.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 10:08 AM Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> TCM is a perfectly valid basis for this, but TCM is only really *necessary* to solve meta-config problems where we can't rely on the rest of the database working. Particularly placement issues, which is why schema and membership need to live there.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It should be possible to use distributed system tables just fine for capabilities, config and guardrails.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> That said, it's possible config might be better represented as part of the schema (and we already store some relevant config there), in which case it would live in TCM automatically. Migrating existing configs to a distributed setup will be fun however we do it, though.
>>>>>>>>>>>>>>> Capabilities also feel naturally related to other membership information, so TCM might be the most suitable place, particularly for handling downgrades after capabilities have been enabled (if we ever expect to support turning off capabilities and then downgrading, which today we mostly don't).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 20 Dec 2024, at 08:42, Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jordan,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I also think that having it on TCM would be ideal, and we should explore this path first before doing anything custom.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regarding my idea about the guardrails in TCM: when I prototyped that and wanted to make it happen, there was a little bit of pushback (1) (even though a super reasonable one) that TCM is just too young at the moment and it would be desirable to go through some stabilisation period.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another idea was that we should not make just guardrails happen, but that the whole config should be in TCM. From what I put together, Sam / Alex do not seem to be opposed to this idea, rather the opposite, but having a CEP about that is far more involved than having just guardrails there. I consider guardrails to be kind of special, and I do not think that having all configuration in TCM (which guardrails are part of) is an absolute must in order to deliver that. I may start with a guardrails CEP and you may explore a Capabilities CEP on TCM too, if that makes sense?
>>>>>>>>>>>>>>>> I just wanted to raise the point about the time at which this would be delivered. If Capabilities are built on TCM, and I wanted to do Guardrails on TCM too but was told it is probably too soon, I guess you would experience something similar.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sam's comment is from May; maybe a lot has changed since then and his comment is not applicable anymore. It would be great to know whether we could build on top of the current trunk already, or whether we will wait until 5.1/6.0 is delivered.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19593?focusedCommentId=17844326&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17844326
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 2:17 AM Jordan West <jorda...@gmail.com> wrote:
>>>>>>>>>>>>>>>> Firstly, glad to see the support and enthusiasm here and in the recent Slack discussion. I think there is enough for me to start drafting a CEP.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Stefan, global configuration and capabilities have some overlap, but not full overlap. For example, you may want to set globally that a cluster enables feature X, or control the threshold for a guardrail, but you still need to know whether all nodes support feature X or have that guardrail; the latter is what capabilities targets. I do think capabilities are a step towards supporting global configuration, and the work you described is another step (that we could do after capabilities, or in parallel with them in mind).
>>>>>>>>>>>>>>>> I am also supportive of exploring global configuration for the reasons you mentioned.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In terms of how capabilities get propagated across the cluster, I hadn't put much thought into it yet past likely TCM, since this will be a new feature that lands after TCM. In Riak, we had gossip (but more mature than C*'s -- this was an area I contributed to a lot, so very familiar) to disseminate less critical information such as capabilities, and a separate layer that did TCM. Since we don't have this in C*, I don't think we would want to build a separate distribution channel for capabilities metadata when we already have TCM in place. But I plan to explore this more as I draft the CEP.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jordan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 1:48 PM Štefan Miklošovič <smikloso...@apache.org> wrote:
>>>>>>>>>>>>>>>> Hi Jordan,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What would this look like from the implementation perspective? I was experimenting with transactional guardrails where an operator would control the content of a virtual table which would be backed by TCM, so whatever guardrail we changed would be automatically and transparently propagated to every node in a cluster. The POC worked quite nicely. TCM is just a vehicle to commit a change which would spread around, and all these settings would survive restarts. We would have the same configuration everywhere, which is not currently the case, because guardrails are configured per node and, if not persisted to yaml, their values are forgotten on restart.
>>>>>>>>>>>>>>>> Guardrails are just an example; the obvious next step is to expand this idea to the whole configuration in yaml. Of course, not all properties in yaml make sense to be the same cluster-wide (IP addresses, etc. ...), but the ones which do would again be set the same way everywhere.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The approach I described above ensures that the configuration is the same everywhere, hence there can be no misunderstanding about what features this or that node has: if we say that all nodes have to have a particular feature, because we said so in the TCM log, then on restart / replay a node will "catch up" with whatever features it is asked to turn on.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Your approach seems to be that we distribute what capabilities / features a cluster supports, and that each individual node configures itself in some way, or not, to comply?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is there any intersection between these approaches? At first sight they seem somehow related. How is one different from the other, from your point of view?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-19593
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 12:00 AM Jordan West <jw...@apache.org> wrote:
>>>>>>>>>>>>>>>> In a recent discussion on the pains of upgrading, one topic that came up is a feature that Riak had called Capabilities [1]. A major pain with upgrades is that each node independently decides when to start using new or modified functionality.
>>>>>>>>>>>>>>>> Even when we put this behind a config (like storage compatibility mode), each node immediately enables the feature when the config is changed and the node is restarted. This causes various types of upgrade pain, such as failed streams and schema disagreement. A recent example of this is CASSANDRA-20118 [2]. In some cases operators can prevent this from happening through careful coordination (e.g. ensuring upgradesstables only runs after the whole cluster is upgraded), but it typically requires custom code in whatever control plane the operator is using. A capabilities framework would distribute the state of what features each node has (and their status, e.g. enabled or not) so that the cluster can choose to opt in to new features once the whole cluster has them available. From experience, having this in Riak made upgrades a significantly less risky process and also paved a path towards repeatable downgrades. I think Cassandra would benefit from it as well.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Further, other tools like analytics could benefit from having this information, since currently it's up to the operator to manually determine the state of the cluster in some cases.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am considering drafting a CEP proposal for this feature but wanted to take the general temperature of the community and get some early thoughts while working on the draft.
>>>>>>>>>>>>>>>> Looking forward to hearing y'alls thoughts,
>>>>>>>>>>>>>>>> Jordan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://github.com/basho/riak_core/blob/25d9a6fa917eb8a2e95795d64eb88d7ad384ed88/src/riak_core_capability.erl#L23-L72
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/CASSANDRA-20118
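The Riak-style capability negotiation described in the original message above (each node advertises the feature versions it supports, and the cluster enables a version only once every member advertises it) can be sketched roughly as follows. All names are hypothetical; this is an illustration of the idea, not the riak_core implementation or any proposed Cassandra API.

```python
# Rough sketch of capability negotiation (hypothetical names). The
# cluster-wide mode for a capability is the most-preferred version that
# *every* node advertises, so new behaviour only switches on once the
# whole cluster supports it.

def negotiate(preference, advertised):
    """preference: feature versions ordered newest-first, e.g. ["v2", "v1"].
    advertised: {node_name: set of versions that node supports}."""
    if not advertised:
        return None
    # Only versions supported by every node are candidates.
    common = set.intersection(*advertised.values())
    for version in preference:
        if version in common:
            return version
    return None  # no version is supported cluster-wide

# Mid-upgrade: one node still only speaks v1, so the cluster stays on v1.
mid_upgrade = {"n1": {"v1", "v2"}, "n2": {"v1", "v2"}, "n3": {"v1"}}
assert negotiate(["v2", "v1"], mid_upgrade) == "v1"

# Once every node advertises v2, the cluster can opt in to it.
upgraded = {"n1": {"v1", "v2"}, "n2": {"v1", "v2"}, "n3": {"v1", "v2"}}
assert negotiate(["v2", "v1"], upgraded) == "v2"
```

The key property for upgrade safety is that the negotiated answer changes only when the last lagging node starts advertising the new version, which is exactly the "opt in once the whole cluster has it" behaviour the thread discusses.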