Using TCM to distribute this information across the cluster vs. using some other LWT-ish distributed CP solution higher in the stack should offer effectively the same UX guarantees to us and our users, right? So I think it's still quite viable, even if we're just LWT'ing things into distributed tables, doing something silly like CL_ALL, etc.
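To make that concrete, here's a minimal sketch of the kind of thing I mean (the keyspace, table, and config names are all hypothetical, not a proposal):

CREATE TABLE IF NOT EXISTS system_distributed.shared_config (
    name text PRIMARY KEY,
    value text
);

-- LWT the new value in so concurrent writers serialize cleanly
UPDATE system_distributed.shared_config
SET value = '128MiB/s'
WHERE name = 'compaction_throughput'
IF value = '64MiB/s';

-- readers observe the committed value with a SERIAL read (or a high CL)
SELECT value FROM system_distributed.shared_config
WHERE name = 'compaction_throughput';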
On Wed, Jan 29, 2025, at 5:44 AM, Štefan Miklošovič wrote: > I want to ask about this ticket in particular, I know I am somehow hijacking > this thread but taking recent discussion into account where we kind of > rejected the idea of using the TCM log for storing configuration, what does this > mean for tickets like this? Is this still viable or do we need to completely > diverge from this approach and figure out something else? > > Thanks > > (1) https://issues.apache.org/jira/browse/CASSANDRA-19130 > > On Tue, Jan 7, 2025 at 1:04 PM Štefan Miklošovič <smikloso...@apache.org> > wrote: >> It would be cool if it acted like this, then the whole plugin would >> become irrelevant when it comes to the migrations. >> >> https://github.com/instaclustr/cassandra-everywhere-strategy >> https://github.com/instaclustr/cassandra-everywhere-strategy?tab=readme-ov-file#motivation >> >> On Mon, Jan 6, 2025 at 11:09 PM Jon Haddad <j...@rustyrazorblade.com> wrote: >>> What about finally adding a much-desired EverywhereStrategy? It wouldn't >>> just be useful for config - system_auth bites a lot of people today. >>> >>> As much as I don't like to suggest row cache, it might be a good fit here >>> as well. We could remove the custom code around the auth cache in the process. >>> >>> Jon >>> >>> On Mon, Jan 6, 2025 at 12:48 PM Benedict Elliott Smith >>> <bened...@apache.org> wrote: >>>> The more we talk about this, the more my position crystallises against >>>> this approach. The feature we’re discussing here should be easy to >>>> implement on top of user-facing functionality; we aren’t the only people >>>> who want functionality like this. We should be dogfooding our own UX for >>>> this kind of capability. >>>> >>>> TCM is unique in that it *cannot* dogfood the database. As a result it is >>>> not only critical for correctness, it’s also more complex - and >>>> inefficient - than a native database feature could be. It’s the worst of >>>> both worlds: we couple critical functionality to non-critical features, >>>> and couple those non-critical features to more complex logic than they >>>> need. >>>> >>>> My vote would be to introduce a new table feature that provides a >>>> node-local, time-bounded cache, so that you can safely perform CL.ONE >>>> queries against it, and let the whole world use it. >>>> >>>> >>>>> On 6 Jan 2025, at 18:23, Blake Eggleston <beggles...@apple.com> wrote: >>>>> >>>>>>>>>> TCM was designed with a couple of very specific correctness-critical >>>>>>>>>> use cases in mind, not as a generic mechanism for everyone to extend. >>>>> >>>>> Its initial scope was for those use cases, but its potential for >>>>> enabling more sophisticated functionality was one of its selling points >>>>> and is listed in the CEP. >>>>> >>>>>> Folks transitively breaking cluster membership by accidentally breaking >>>>>> the shared dependency of a non-critical feature is a risk I don’t like >>>>>> much. >>>>> >>>>> Having multiple distributed config systems operating independently is >>>>> going to create its own set of problems, especially if the distributed >>>>> config has any level of interaction with schema or topology. >>>>> >>>>> I lean towards distributed config going into TCM, although a more >>>>> friendly API for extension that offers some guardrails would be a good >>>>> idea. >>>>> >>>>>> On Jan 6, 2025, at 9:21 AM, Aleksey Yeshchenko <alek...@apple.com> wrote: >>>>>> >>>>>>> Would you mind elaborating on what makes it unsuitable? 
I don’t have a >>>>>>> good mental model on its properties, so I assumed that it could be used >>>>>>> to disseminate arbitrary key-value pairs like config fairly easily. >>>>>> >>>>>> It’s more than *capable* of disseminating arbitrary-ish key-value pairs >>>>>> - it can deal with schema after all. >>>>>> >>>>>> I claim it to be *unsuitable* because of the coupling it would introduce >>>>>> between components of different levels of criticality. You can derisk it >>>>>> partially by having separate logs (which might not be trivial to >>>>>> implement). But unless you also duplicate all the TCM logic in some >>>>>> other package, the shared code dependency coupling persists. Folks >>>>>> transitively breaking cluster membership by accidentally breaking the >>>>>> shared dependency of a non-critical feature is a risk I don’t like much. >>>>>> Keep it tight, single-purpose, let it harden over time without being >>>>>> disrupted. >>>>>> >>>>>>> On 6 Jan 2025, at 16:54, Aleksey Yeshchenko <alek...@apple.com> wrote: >>>>>>> >>>>>>> I agree that this would be useful, yes. >>>>>>> >>>>>>> An LWT/Accord variant plus a plain-writes eventually-consistent >>>>>>> variant. A generic-by-design internal-only per-table mechanism with >>>>>>> optional caching + optional write notifications issued to non-replicas. >>>>>>> >>>>>>>> On 6 Jan 2025, at 14:26, Josh McKenzie <jmcken...@apache.org> wrote: >>>>>>>> >>>>>>>>> I think if we go down the route of pushing configs around with LWT + >>>>>>>>> caching instead, we should have that be a generic system that is >>>>>>>>> designed for everyone to use. >>>>>>>> Agreed. Otherwise we end up with the same problem Aleksey's speaking >>>>>>>> about above, where we build something for a specific purpose and then >>>>>>>> maintainers in the future with a reasonable need extend or bend it to >>>>>>>> fit their new need, risking destabilizing the original implementation. >>>>>>>> >>>>>>>> Better to have a solid shared primitive other features can build upon. >>>>>>>> >>>>>>>> On Mon, Jan 6, 2025, at 8:33 AM, Jon Haddad wrote: >>>>>>>>> Would you mind elaborating on what makes it unsuitable? I don’t have >>>>>>>>> a good mental model on its properties, so I assumed that it could be >>>>>>>>> used to disseminate arbitrary key-value pairs like config fairly >>>>>>>>> easily. >>>>>>>>> >>>>>>>>> Somewhat humorously, I think that same assumption was made when >>>>>>>>> putting SAI metadata into gossip, which caused a cluster with 800 2i >>>>>>>>> to break it. >>>>>>>>> >>>>>>>>> I think if we go down the route of pushing configs around with LWT + >>>>>>>>> caching instead, we should have that be a generic system that is >>>>>>>>> designed for everyone to use. Then we have a gossip replacement, >>>>>>>>> reduce config clutter, and people have something that can be used >>>>>>>>> without adding another bespoke system into the mix. >>>>>>>>> >>>>>>>>> Jon >>>>>>>>> >>>>>>>>> On Mon, Jan 6, 2025 at 6:48 AM Aleksey Yeshchenko <alek...@apple.com> >>>>>>>>> wrote: >>>>>>>>>> TCM was designed with a couple of very specific correctness-critical >>>>>>>>>> use cases in mind, not as a generic mechanism for everyone to extend. >>>>>>>>>> >>>>>>>>>> It might be *convenient* to employ TCM for some other features, >>>>>>>>>> which makes it tempting to abuse TCM for an unintended purpose, but >>>>>>>>>> we shouldn’t do what's convenient over what is right. There are >>>>>>>>>> several ways this often goes wrong. 
>>>>>>>>>> >>>>>>>>>> For example, the subsystem gets used as is, without modification, by >>>>>>>>>> a new feature, but in ways that invalidate the assumptions behind >>>>>>>>>> the design of the subsystem - designed for particular use cases. >>>>>>>>>> >>>>>>>>>> For another example, the subsystem *almost* works as is for the new >>>>>>>>>> feature, but doesn't *quite* work as is, so changes are made to it, >>>>>>>>>> and reviewed, by someone not familiar enough with the subsystem >>>>>>>>>> design and implementation. One such change eventually introduces >>>>>>>>>> a bug into the shared critical subsystem, and now everyone is having a >>>>>>>>>> bad time. >>>>>>>>>> >>>>>>>>>> The risks are real, and I’d strongly prefer that we didn’t co-opt a >>>>>>>>>> critical subsystem for a non-critical use-case for this reason alone. >>>>>>>>>> >>>>>>>>>>> On 21 Dec 2024, at 23:18, Jordan West <jorda...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>> I tend to lean towards Josh's perspective. Gossip was poorly tested >>>>>>>>>>> and implemented. I don't think it's a good parallel, or at least I >>>>>>>>>>> hope it's not. Taken to the extreme, we shouldn't touch the database >>>>>>>>>>> at all otherwise, which isn't practical. That said, anything >>>>>>>>>>> touching important subsystems needs more care, testing, and time to >>>>>>>>>>> bake. I think we're mostly discussing "being careful", which I am >>>>>>>>>>> totally on board with. I don't think Benedict ever said "don't use >>>>>>>>>>> TCM", in fact he's said the opposite, but emphasized the care that >>>>>>>>>>> is required when we do, which is totally reasonable. >>>>>>>>>>> >>>>>>>>>>> Back to capabilities, Riak built them on an eventually consistent >>>>>>>>>>> subsystem and they worked fine. If you have a split brain you >>>>>>>>>>> likely don't want to communicate agreement as is (or have already >>>>>>>>>>> learned about agreement and it's not an issue). That said, I don't >>>>>>>>>>> think we have an EC layer in C* I would want to rely on outside of >>>>>>>>>>> distributed tables. So in the context of what we have today, I >>>>>>>>>>> think TCM is a better fit. I still need to dig a little more to be >>>>>>>>>>> convinced and plan to do that as I draft the CEP. >>>>>>>>>>> >>>>>>>>>>> Jordan >>>>>>>>>>> >>>>>>>>>>> On Sat, Dec 21, 2024 at 5:51 AM Benedict <bened...@apache.org> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> I’m not saying we need to tease out bugs from TCM. I’m saying >>>>>>>>>>>> every time someone touches something this central to correctness >>>>>>>>>>>> we introduce a risk of breaking it, and that we should take >>>>>>>>>>>> that risk judiciously. This has zero to do with the amount of data >>>>>>>>>>>> we’re pushing through it, and 100% to do with writing bad code. >>>>>>>>>>>> >>>>>>>>>>>> We treated gossip carefully in part because it was hard to work >>>>>>>>>>>> with, but in part because getting it wrong was particularly bad. >>>>>>>>>>>> We should retain the latter reason for caution. >>>>>>>>>>>> >>>>>>>>>>>> We also absolutely do not need TCM for consistency. We have >>>>>>>>>>>> consistent database functionality for that. TCM is special because >>>>>>>>>>>> it cannot rely on the database mechanisms, as it underpins them. >>>>>>>>>>>> That is the whole point of why we should treat it carefully. 
>>>>>>>>>>>> >>>>>>>>>>>>> On 21 Dec 2024, at 13:43, Josh McKenzie <jmcken...@apache.org> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> To play the devil's advocate - the more we exercise TCM the more >>>>>>>>>>>>> bugs we suss out. To Jon's point, the volume of information we're >>>>>>>>>>>>> talking about here in terms of capabilities dissemination >>>>>>>>>>>>> shouldn't stress TCM at all. >>>>>>>>>>>>> >>>>>>>>>>>>> I think a reasonable heuristic for relying on TCM for something >>>>>>>>>>>>> is whether there's a big difference in UX on something being >>>>>>>>>>>>> eventually consistent vs. strongly consistent. Exposing features >>>>>>>>>>>>> to clients based on whether the entire cluster supports them >>>>>>>>>>>>> seems like the kind of thing that could cause pain if we're in a >>>>>>>>>>>>> split-brain, cluster-is-settling-on-agreement kind of paradigm. >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Dec 20, 2024, at 3:17 PM, Benedict wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Mostly conceptual; the problem with a linearizable history is >>>>>>>>>>>>>> that if you lose some of it (e.g. because some logic bug prevents >>>>>>>>>>>>>> you from processing some epoch) you stop the world until an >>>>>>>>>>>>>> operator can step in to perform surgery on what the history >>>>>>>>>>>>>> should be. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I do know of one recent bug in schema changes in CEP-15 that >>>>>>>>>>>>>> broke TCM in this way. That particular avenue will be hardened, >>>>>>>>>>>>>> but the fewer places we risk this the better IMO. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Of course, there are steps we could take to expose a limited API >>>>>>>>>>>>>> targeting these use cases, as well as using a separate log for >>>>>>>>>>>>>> ancillary functionality, that might better balance risk:reward. >>>>>>>>>>>>>> But equally I’m not sure it makes sense to TCM all the things, >>>>>>>>>>>>>> and maybe dogfooding our own database features and developing >>>>>>>>>>>>>> functionality that enables our own use cases could be better >>>>>>>>>>>>>> where it isn’t necessary 🤷‍♀️ >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> On 20 Dec 2024, at 19:22, Jordan West <jorda...@gmail.com> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 11:06 AM Benedict <bened...@apache.org> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> If TCM breaks we all have a really bad time, much worse than >>>>>>>>>>>>>>>> if any one of these features individually has problems. If you >>>>>>>>>>>>>>>> break TCM in the right way the cluster could become >>>>>>>>>>>>>>>> inoperable, or operations like topology changes may be >>>>>>>>>>>>>>>> prevented. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Benedict, when you say this are you speaking hypothetically (in >>>>>>>>>>>>>>> the sense that by using TCM more we increase the probability of >>>>>>>>>>>>>>> using it "wrong" and hitting an unknown edge case) or are there >>>>>>>>>>>>>>> known ways today that TCM "breaks"? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This means that even a parallel log has some risk if we end up >>>>>>>>>>>>>>>> modifying shared functionality. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 20 Dec 2024, at 18:47, Štefan Miklošovič >>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I stand corrected. The C in TCM is "cluster" :D Anyway, >>>>>>>>>>>>>>>>> configuration is super reasonable to put there. 
>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:42 PM Štefan Miklošovič >>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>> I am super hesitant to base distributed guardrails or any >>>>>>>>>>>>>>>>>> configuration for that matter on anything but TCM. Doesn't >>>>>>>>>>>>>>>>>> the "C" in TCM stand for "configuration" anyway? So rename it to >>>>>>>>>>>>>>>>>> TSM, as in "schema", if it is meant to be just for that. >>>>>>>>>>>>>>>>>> It seems quite ridiculous to code tables with caches >>>>>>>>>>>>>>>>>> on top when, thanks to CEP-21, we have far more effective >>>>>>>>>>>>>>>>>> tooling to deal with that, with the clear advantage of getting >>>>>>>>>>>>>>>>>> rid of all of that old mechanism we have in place. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I have not seen any concrete examples of risks showing why TCM >>>>>>>>>>>>>>>>>> should be used just for what it is currently for. Why not put >>>>>>>>>>>>>>>>>> configuration meant to be cluster-wide in there? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> What is it ... performance? What does the term >>>>>>>>>>>>>>>>>> "additional complexity" even mean? Complex in what way? Do you think >>>>>>>>>>>>>>>>>> that putting three types of transformations there which, in the case of >>>>>>>>>>>>>>>>>> guardrails, flip some booleans and numbers would >>>>>>>>>>>>>>>>>> suddenly make TCM way more complex? Come on ... >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This has nothing to do with what Jordan is trying to >>>>>>>>>>>>>>>>>> introduce. I think we all agree he knows what he is doing >>>>>>>>>>>>>>>>>> and if he evaluates that TCM is too much for his use case >>>>>>>>>>>>>>>>>> (or it is not a good fit) that is perfectly fine. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 7:22 PM Paulo Motta >>>>>>>>>>>>>>>>>> <pa...@apache.org> wrote: >>>>>>>>>>>>>>>>>>> > It should be possible to use distributed system tables >>>>>>>>>>>>>>>>>>> > just fine for capabilities, config and guardrails. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I have been thinking about this recently and I agree we >>>>>>>>>>>>>>>>>>> should be wary about introducing new TCM states and creating >>>>>>>>>>>>>>>>>>> additional complexity that can be serviced by existing data >>>>>>>>>>>>>>>>>>> dissemination mechanisms (gossip/system tables). I would >>>>>>>>>>>>>>>>>>> prefer that we take a more phased and incremental approach >>>>>>>>>>>>>>>>>>> to introducing new TCM states. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As a way to accomplish that, I have thought about >>>>>>>>>>>>>>>>>>> introducing a new generic TCM state "In Maintenance", where >>>>>>>>>>>>>>>>>>> schema or membership changes are "frozen/disallowed" while >>>>>>>>>>>>>>>>>>> an external operation is taking place. This "external >>>>>>>>>>>>>>>>>>> operation" could mean many things: >>>>>>>>>>>>>>>>>>> - Upgrade >>>>>>>>>>>>>>>>>>> - Downgrade >>>>>>>>>>>>>>>>>>> - Migration >>>>>>>>>>>>>>>>>>> - Capability Enablement/Disablement >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> These could be sub-states of the "Maintenance" TCM state, >>>>>>>>>>>>>>>>>>> which could be managed externally (via cache/gossip/system >>>>>>>>>>>>>>>>>>> tables/sidecar). Once these sub-states are validated >>>>>>>>>>>>>>>>>>> thoroughly and are mature enough, we could "promote" them to >>>>>>>>>>>>>>>>>>> top-level TCM states. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> In the end, what really matters is that cluster membership >>>>>>>>>>>>>>>>>>> and schema changes do not happen while a miscellaneous >>>>>>>>>>>>>>>>>>> operation is taking place. 
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Would this make sense as an initial way to integrate TCM >>>>>>>>>>>>>>>>>>> with the capabilities framework? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 4:53 AM Benedict >>>>>>>>>>>>>>>>>>> <bened...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> If you perform a read from a distributed table on startup >>>>>>>>>>>>>>>>>>>> you will find the latest information. What catchup are you >>>>>>>>>>>>>>>>>>>> thinking of? I don’t think any of the features we talked >>>>>>>>>>>>>>>>>>>> about need a log, only the latest information. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> We can (and should) probably introduce event listeners for >>>>>>>>>>>>>>>>>>>> distributed tables, as this is also a really great >>>>>>>>>>>>>>>>>>>> feature, but I don’t think this should be necessary here. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding disagreements: if you use LWTs then there are no >>>>>>>>>>>>>>>>>>>> consistency issues to worry about. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Again, I’m not opposed to using TCM, although I am a >>>>>>>>>>>>>>>>>>>> little worried TCM is becoming our new hammer with >>>>>>>>>>>>>>>>>>>> everything a nail. It would be better IMO to keep TCM >>>>>>>>>>>>>>>>>>>> scoped to essential functionality as it’s critical to >>>>>>>>>>>>>>>>>>>> correctness. Perhaps we could extend its APIs to less >>>>>>>>>>>>>>>>>>>> critical services without intertwining them with >>>>>>>>>>>>>>>>>>>> membership, schema and epoch handling. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 20 Dec 2024, at 09:43, Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I find TCM way more comfortable to work with. The >>>>>>>>>>>>>>>>>>>>> capability of the log being replayed on restart and catching >>>>>>>>>>>>>>>>>>>>> up with everything else automatically is a godsend. If we >>>>>>>>>>>>>>>>>>>>> had that on "good old distributed tables", then is it not >>>>>>>>>>>>>>>>>>>>> true that we would need to take extra care of that, e.g. >>>>>>>>>>>>>>>>>>>>> we would need to repair it etc ... It might be a source >>>>>>>>>>>>>>>>>>>>> of discrepancies / disagreements etc. TCM is just >>>>>>>>>>>>>>>>>>>>> "maintenance-free" and _just works_. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I think I was also investigating distributed tables but >>>>>>>>>>>>>>>>>>>>> was just pulled towards TCM naturally because of its >>>>>>>>>>>>>>>>>>>>> goodies. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 10:08 AM Benedict >>>>>>>>>>>>>>>>>>>>> <bened...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> TCM is a perfectly valid basis for this, but TCM is only >>>>>>>>>>>>>>>>>>>>>> really *necessary* to solve meta config problems where >>>>>>>>>>>>>>>>>>>>>> we can’t rely on the rest of the database working. >>>>>>>>>>>>>>>>>>>>>> Particularly placement issues, which is why schema and >>>>>>>>>>>>>>>>>>>>>> membership need to live there. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> It should be possible to use distributed system tables >>>>>>>>>>>>>>>>>>>>>> just fine for capabilities, config and guardrails. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> That said, it’s possible config might be better >>>>>>>>>>>>>>>>>>>>>> represented as part of the schema (and we already store >>>>>>>>>>>>>>>>>>>>>> some relevant config there) in which case it would live >>>>>>>>>>>>>>>>>>>>>> in TCM automatically. 
Migrating existing configs to a >>>>>>>>>>>>>>>>>>>>>> distributed setup will be fun however we do it though. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Capabilities also feel naturally related to other >>>>>>>>>>>>>>>>>>>>>> membership information, so TCM might be the most >>>>>>>>>>>>>>>>>>>>>> suitable place, particularly for handling downgrades >>>>>>>>>>>>>>>>>>>>>> after capabilities have been enabled (if we ever expect >>>>>>>>>>>>>>>>>>>>>> to support turning off capabilities and then downgrading >>>>>>>>>>>>>>>>>>>>>> - which today we mostly don’t). >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On 20 Dec 2024, at 08:42, Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Jordan, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I also think that having it on TCM would be ideal and >>>>>>>>>>>>>>>>>>>>>>> we should explore this path first before doing anything >>>>>>>>>>>>>>>>>>>>>>> custom. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Regarding my idea about the guardrails in TCM, when I >>>>>>>>>>>>>>>>>>>>>>> prototyped that and wanted to make it happen, there was >>>>>>>>>>>>>>>>>>>>>>> a little bit of a pushback (1) (even though a super >>>>>>>>>>>>>>>>>>>>>>> reasonable one) that TCM is just too young at the >>>>>>>>>>>>>>>>>>>>>>> moment and it would be desirable to go through some >>>>>>>>>>>>>>>>>>>>>>> stabilisation period. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Another idea was that we should not just make >>>>>>>>>>>>>>>>>>>>>>> guardrails happen but that the whole config should be in >>>>>>>>>>>>>>>>>>>>>>> TCM. From what I put together, Sam / Alex do not seem >>>>>>>>>>>>>>>>>>>>>>> to be opposed to this idea, rather the opposite, but >>>>>>>>>>>>>>>>>>>>>>> having a CEP about that is way more involved than having >>>>>>>>>>>>>>>>>>>>>>> just guardrails there. I consider guardrails to be kind >>>>>>>>>>>>>>>>>>>>>>> of special and I do not think that having all >>>>>>>>>>>>>>>>>>>>>>> configurations in TCM (which guardrails are part of) is >>>>>>>>>>>>>>>>>>>>>>> an absolute must in order to deliver that. I may start >>>>>>>>>>>>>>>>>>>>>>> with a guardrails CEP and you may explore a Capabilities >>>>>>>>>>>>>>>>>>>>>>> CEP on TCM too, if that makes sense? >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> I just wanted to raise the point about the time this >>>>>>>>>>>>>>>>>>>>>>> would be delivered. If Capabilities are built on TCM >>>>>>>>>>>>>>>>>>>>>>> and I wanted to do Guardrails on TCM too but was >>>>>>>>>>>>>>>>>>>>>>> told it is probably too soon, I guess you would >>>>>>>>>>>>>>>>>>>>>>> experience something similar. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Sam's comment is from May and maybe a lot has changed >>>>>>>>>>>>>>>>>>>>>>> since then and his comment is not applicable >>>>>>>>>>>>>>>>>>>>>>> anymore. It would be great to know if we could build on >>>>>>>>>>>>>>>>>>>>>>> top of the current trunk already or whether we will wait >>>>>>>>>>>>>>>>>>>>>>> until 5.1/6.0 is delivered. 
>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> (1) >>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-19593?focusedCommentId=17844326&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17844326 >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Fri, Dec 20, 2024 at 2:17 AM Jordan West >>>>>>>>>>>>>>>>>>>>>>> <jorda...@gmail.com> wrote: >>>>>>>>>>>>>>>>>>>>>>>> Firstly, glad to see the support and enthusiasm here >>>>>>>>>>>>>>>>>>>>>>>> and in the recent Slack discussion. I think there is >>>>>>>>>>>>>>>>>>>>>>>> enough for me to start drafting a CEP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Stefan, global configuration and capabilities do have >>>>>>>>>>>>>>>>>>>>>>>> some overlap but not full overlap. For example, you >>>>>>>>>>>>>>>>>>>>>>>> may want to set globally that a cluster enables >>>>>>>>>>>>>>>>>>>>>>>> feature X or control the threshold for a guardrail but >>>>>>>>>>>>>>>>>>>>>>>> you still need to know if all nodes support feature X >>>>>>>>>>>>>>>>>>>>>>>> or have that guardrail; the latter is what >>>>>>>>>>>>>>>>>>>>>>>> capabilities targets. I do think capabilities are a >>>>>>>>>>>>>>>>>>>>>>>> step towards supporting global configuration and the >>>>>>>>>>>>>>>>>>>>>>>> work you described is another step (that we could do >>>>>>>>>>>>>>>>>>>>>>>> after capabilities or in parallel with them in mind). >>>>>>>>>>>>>>>>>>>>>>>> I am also supportive of exploring global configuration >>>>>>>>>>>>>>>>>>>>>>>> for the reasons you mentioned. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> In terms of how capabilities get propagated across the >>>>>>>>>>>>>>>>>>>>>>>> cluster, I hadn't put much thought into it yet beyond >>>>>>>>>>>>>>>>>>>>>>>> it likely being TCM, since this will be a new feature that lands >>>>>>>>>>>>>>>>>>>>>>>> after TCM. In Riak, we had gossip (but more mature >>>>>>>>>>>>>>>>>>>>>>>> than C*'s -- this was an area I contributed to a lot, so >>>>>>>>>>>>>>>>>>>>>>>> I am very familiar with it) to disseminate less critical >>>>>>>>>>>>>>>>>>>>>>>> information such as capabilities, and a separate layer >>>>>>>>>>>>>>>>>>>>>>>> that did TCM. Since we don't have this in C* I don't >>>>>>>>>>>>>>>>>>>>>>>> think we would want to build a separate distribution >>>>>>>>>>>>>>>>>>>>>>>> channel for capabilities metadata when we already have >>>>>>>>>>>>>>>>>>>>>>>> TCM in place. But I plan to explore this more as I >>>>>>>>>>>>>>>>>>>>>>>> draft the CEP. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 1:48 PM Štefan Miklošovič >>>>>>>>>>>>>>>>>>>>>>>> <smikloso...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> Hi Jordan, >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> what would this look like from the implementation >>>>>>>>>>>>>>>>>>>>>>>>> perspective? I was experimenting with transactional >>>>>>>>>>>>>>>>>>>>>>>>> guardrails where an operator would control the >>>>>>>>>>>>>>>>>>>>>>>>> content of a virtual table which would be backed by >>>>>>>>>>>>>>>>>>>>>>>>> TCM, so whatever guardrail we changed would >>>>>>>>>>>>>>>>>>>>>>>>> be automatically and transparently propagated to >>>>>>>>>>>>>>>>>>>>>>>>> every node in a cluster. The POC worked quite nicely. >>>>>>>>>>>>>>>>>>>>>>>>> TCM is just a vehicle to commit a change which would >>>>>>>>>>>>>>>>>>>>>>>>> spread around and all these settings would survive >>>>>>>>>>>>>>>>>>>>>>>>> restarts. 
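>>>>>>>>>>>>>>>>>>>>>>>>> To sketch the operator-facing side of what I mean (the virtual table and column names below are purely illustrative, not a committed interface):
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> -- flip a guardrail once; the change is committed via TCM and replicated cluster-wide
>>>>>>>>>>>>>>>>>>>>>>>>> UPDATE system_guardrails.flags SET value = false WHERE name = 'drop_truncate_table_enabled';
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> -- every node, now and after any restart, reads the committed value back
>>>>>>>>>>>>>>>>>>>>>>>>> SELECT name, value FROM system_guardrails.flags WHERE name = 'drop_truncate_table_enabled';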
We would have the same configuration >>>>>>>>>>>>>>>>>>>>>>>>> everywhere which is not currently the case because >>>>>>>>>>>>>>>>>>>>>>>>> guardrails are configured per node and if not >>>>>>>>>>>>>>>>>>>>>>>>> persisted to yaml, on restart their values would be >>>>>>>>>>>>>>>>>>>>>>>>> forgotten. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Guardrails are just an example; the obvious next step >>>>>>>>>>>>>>>>>>>>>>>>> is to expand this idea to the whole configuration in >>>>>>>>>>>>>>>>>>>>>>>>> yaml. Of course, not all properties in yaml make >>>>>>>>>>>>>>>>>>>>>>>>> sense to be the same cluster-wide (IP addresses etc >>>>>>>>>>>>>>>>>>>>>>>>> ...), but the ones which do would again be set >>>>>>>>>>>>>>>>>>>>>>>>> everywhere the same way. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> The approach I described above is that we make sure >>>>>>>>>>>>>>>>>>>>>>>>> that the configuration is the same everywhere, hence >>>>>>>>>>>>>>>>>>>>>>>>> there can be no misunderstanding about what features this >>>>>>>>>>>>>>>>>>>>>>>>> or that node has; if we say that all nodes have to >>>>>>>>>>>>>>>>>>>>>>>>> have a particular feature because we said so in the TCM >>>>>>>>>>>>>>>>>>>>>>>>> log, then on restart / replay a node will "catch up" >>>>>>>>>>>>>>>>>>>>>>>>> with whatever features it is asked to turn on. >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Your approach seems to be that we distribute all the >>>>>>>>>>>>>>>>>>>>>>>>> capabilities / features a cluster supports and that >>>>>>>>>>>>>>>>>>>>>>>>> each individual node configures itself in some way (or >>>>>>>>>>>>>>>>>>>>>>>>> not) to comply? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Is there any intersection in these approaches? At >>>>>>>>>>>>>>>>>>>>>>>>> first sight they seem somewhat related. How is one >>>>>>>>>>>>>>>>>>>>>>>>> different from the other from your point of view? >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Regards >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> (1) >>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-19593 >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, Dec 19, 2024 at 12:00 AM Jordan West >>>>>>>>>>>>>>>>>>>>>>>>> <jw...@apache.org> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> In a recent discussion on the pains of upgrading, one >>>>>>>>>>>>>>>>>>>>>>>>>> topic that came up was a feature that Riak had called >>>>>>>>>>>>>>>>>>>>>>>>>> Capabilities [1]. A major pain with upgrades is that >>>>>>>>>>>>>>>>>>>>>>>>>> each node independently decides when to start using >>>>>>>>>>>>>>>>>>>>>>>>>> new or modified functionality. Even when we put this >>>>>>>>>>>>>>>>>>>>>>>>>> behind a config (like storage compatibility mode) >>>>>>>>>>>>>>>>>>>>>>>>>> each node immediately enables the feature when the >>>>>>>>>>>>>>>>>>>>>>>>>> config is changed and the node is restarted. This >>>>>>>>>>>>>>>>>>>>>>>>>> causes various types of upgrade pain such as failed >>>>>>>>>>>>>>>>>>>>>>>>>> streams and schema disagreement. A recent example of >>>>>>>>>>>>>>>>>>>>>>>>>> this is CASSANDRA-20118 [2]. In some cases operators >>>>>>>>>>>>>>>>>>>>>>>>>> can prevent this from happening through careful >>>>>>>>>>>>>>>>>>>>>>>>>> coordination (e.g. ensuring upgradesstables only >>>>>>>>>>>>>>>>>>>>>>>>>> runs after the whole cluster is upgraded) but this >>>>>>>>>>>>>>>>>>>>>>>>>> typically requires custom code in whatever control >>>>>>>>>>>>>>>>>>>>>>>>>> plane the operator is using. 
A capabilities >>>>>>>>>>>>>>>>>>>>>>>>>> framework would distribute the state of what >>>>>>>>>>>>>>>>>>>>>>>>>> features each node has (and their status, e.g. >>>>>>>>>>>>>>>>>>>>>>>>>> enabled or not) so that the cluster can choose to >>>>>>>>>>>>>>>>>>>>>>>>>> opt in to new features once the whole cluster has >>>>>>>>>>>>>>>>>>>>>>>>>> them available (see the strawman sketch at the end >>>>>>>>>>>>>>>>>>>>>>>>>> of this mail). From experience, having this in Riak >>>>>>>>>>>>>>>>>>>>>>>>>> made upgrades a significantly less risky process and >>>>>>>>>>>>>>>>>>>>>>>>>> also paved a path towards repeatable downgrades. I >>>>>>>>>>>>>>>>>>>>>>>>>> think Cassandra would benefit from it as well. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Further, other tools like analytics could benefit >>>>>>>>>>>>>>>>>>>>>>>>>> from having this information since currently it's up >>>>>>>>>>>>>>>>>>>>>>>>>> to the operator to manually determine the state of >>>>>>>>>>>>>>>>>>>>>>>>>> the cluster in some cases. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I am considering drafting a CEP proposal for this >>>>>>>>>>>>>>>>>>>>>>>>>> feature but wanted to take the general temperature >>>>>>>>>>>>>>>>>>>>>>>>>> of the community and get some early thoughts while >>>>>>>>>>>>>>>>>>>>>>>>>> working on the draft. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to hearing y'all's thoughts, >>>>>>>>>>>>>>>>>>>>>>>>>> Jordan >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>>>>>>>>> https://github.com/basho/riak_core/blob/25d9a6fa917eb8a2e95795d64eb88d7ad384ed88/src/riak_core_capability.erl#L23-L72 >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> [2] >>>>>>>>>>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/CASSANDRA-20118
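>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> P.S. As a strawman for the shape of the state involved (the schema is purely illustrative, not part of any proposal yet):
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> CREATE TABLE capabilities (
>>>>>>>>>>>>>>>>>>>>>>>>>>     capability text,
>>>>>>>>>>>>>>>>>>>>>>>>>>     node_id uuid,
>>>>>>>>>>>>>>>>>>>>>>>>>>     status text,      -- e.g. 'supported', 'enabled'
>>>>>>>>>>>>>>>>>>>>>>>>>>     PRIMARY KEY (capability, node_id)
>>>>>>>>>>>>>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>>>>>>>>>>>>> -- a capability becomes eligible for cluster-wide opt-in only once
>>>>>>>>>>>>>>>>>>>>>>>>>> -- every member node reports it as at least 'supported'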